BACKGROUND 

[3] Standard methods for the analysis of hybrid systems rely on numerical techniques for 
solving the differential equations and can be unsound on certain classes of hybrid systems. These 
techniques are susceptible to dimension explosion problems as the number of state variables 
increases. What has been sought is an approach that is both capable of generating sound abstractions 
and capable of being compositionally applied. 

[4] The application of modeling tools to hybrid qualitative and quantitative systems, 
including, in particular, biological systems, poses significant challenges. A means of providing 
analytical proofs of the unreachability of certain states, as well as of stability and bistability properties 
of complex systems, is sought. This capability is outside the reach of traditional modeling and 
analysis methods. 

[5] Further, what is needed is the ability to reason in an automated or semi-automated 
manner about biological systems to answer complex questions about the biological relevance of 
complicated molecular or intercellular signaling or other biochemical reaction networks. 

SUMMARY OF THE INVENTION 

A. Hybrid Modeling 

[6] The invention provides for the generation of sound abstractions in every case, and an 
algorithm according to the invention may be compositionally applied. 

[7] HybridAbstraction, the invention taught herein, provides the construction of a series 
of successively refined, sound abstractions of a hybrid system. The resulting abstracted system is a 
finite "discrete state transition system" suitable for model checking. 

[8] Hybrid systems exhibit both continuous and discrete dynamics. The core algorithm of 
HybridAbstraction combines qualitative techniques for abstracting the continuous dynamics with 
predicate abstraction techniques for abstracting the discrete transitions. 

[9] Qualitative abstraction consists of keeping only the "sign" information of a finite set 
of functions or forms over the state variables of the system while ignoring the exact valuations of the 
state variables. The sign information is updated by recursively keeping track of the signs of the 
functions representing the derivative of the original functions. 
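
By way of illustration only, the sign update described above may be sketched as follows. The rule shown is one conservative reading and is not taken from the specification: the names `next_signs`, `POS`, `ZERO` and `NEG` are hypothetical, and whenever the derivative sign is zero or untracked the sketch simply allows every sign that continuity permits.

```python
POS, ZERO, NEG = "pos", "zero", "neg"

def next_signs(sign_p, sign_dp=None):
    """Signs a form p may take in the next abstract state.

    sign_p  -- current sign of p
    sign_dp -- current sign of dp/dt, or None if the derivative is not tracked
    This is an assumed, conservative rule, not the patented procedure itself.
    """
    if sign_p == POS:
        # p can only reach zero from above while it is strictly decreasing
        return {POS} if sign_dp in (POS, ZERO) else {POS, ZERO}
    if sign_p == NEG:
        # symmetric case: p can only reach zero from below while increasing
        return {NEG} if sign_dp in (NEG, ZERO) else {NEG, ZERO}
    # p is currently zero: the derivative sign decides the direction of departure
    if sign_dp == POS:
        return {POS}
    if sign_dp == NEG:
        return {NEG}
    # derivative zero or unknown: first-order information is inconclusive
    return {POS, ZERO, NEG}

print(next_signs(POS, NEG))
print(next_signs(ZERO, POS))
```

Tracking the sign of the derivative is what prunes successors: without it, a positive form would always be allowed to reach zero.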

[10] HybridAbstraction includes: 

1. Selecting the seed forms (a set of polynomials); 

2. Adding higher order derivatives (or integrals) of the seed forms, using symbolic 
differentiation (or integration) algorithms; 

3. Constructing the abstract system over the abstract state space defined by the 3-valued (i.e., 
positive, negative, zero) valuation of each of the forms generated by steps 1 and 2 hereinabove. 

Automation of steps 1, 2 and 3 is accomplished by means of decision procedures, symbolic analysis, 
and theorem proving. 
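
Steps 1 and 2 may be sketched, for polynomial dynamics, as follows. This is a minimal illustration under stated assumptions rather than the implementation of the invention: polynomials are represented as dictionaries from exponent tuples to coefficients, the helper names (`lie_derivative`, `saturate`, `is_constant_multiple`) are hypothetical, and a derivative is dropped when it is a constant or a constant multiple of an existing form, since its sign then adds no information.

```python
from fractions import Fraction

# A polynomial over (x, y) is a dict {exponent tuple: coefficient};
# for example x*y**2 is {(1, 2): 1}.

def poly_add(p, q):
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, 0) + c
    return {m: c for m, c in out.items() if c != 0}

def poly_mul(p, q):
    out = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            mono = tuple(a + b for a, b in zip(m1, m2))
            out[mono] = out.get(mono, 0) + c1 * c2
    return {m: c for m, c in out.items() if c != 0}

def partial(p, i):
    """Partial derivative of p with respect to the i-th variable."""
    out = {}
    for mono, c in p.items():
        if mono[i] > 0:
            reduced = mono[:i] + (mono[i] - 1,) + mono[i + 1:]
            out[reduced] = out.get(reduced, 0) + c * mono[i]
    return out

def lie_derivative(p, field):
    """Time derivative of p along x' = field: sum_i (dp/dx_i) * f_i."""
    total = {}
    for i, f_i in enumerate(field):
        total = poly_add(total, poly_mul(partial(p, i), f_i))
    return total

def is_constant_multiple(p, q):
    """True if p = c*q for a nonzero constant c (same sign information as q)."""
    if not p or set(p) != set(q):
        return False
    ratios = {Fraction(p[m]) / Fraction(q[m]) for m in p}
    return len(ratios) == 1

def saturate(seeds, field, max_forms=10):
    """Step 2: add derivatives of the seed forms until nothing new appears."""
    forms = list(seeds)
    queue = list(seeds)
    while queue and len(forms) < max_forms:
        dp = lie_derivative(queue.pop(0), field)
        is_constant = all(sum(mono) == 0 for mono in dp)  # also true for the zero polynomial
        if is_constant or any(is_constant_multiple(dp, q) for q in forms):
            continue
        forms.append(dp)
        queue.append(dp)
    return forms

# Hypothetical example system: x' = y, y' = -x, with the single seed form x.
x = {(1, 0): 1}
y = {(0, 1): 1}
oscillator = [y, {(1, 0): -1}]
print(saturate([x], oscillator))
```

For this example the derivative of x is y, and the derivative of y is -x, a constant multiple of the seed, so saturation stops with the two forms x and y. The `max_forms` bound is an assumed safeguard for systems where saturation does not terminate.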

[11] In step 1, certain functions can be selected as the seed forms. The choice of good 
seed forms results in better abstractions. This observation is valid for both linear and nonlinear 
dynamics and all kinds of functions (not necessarily polynomial). 

[12] Different sets of seed forms can be chosen for different discrete modes of a hybrid 
system. In particular, the algorithm described above can be applied compositionally. It can exploit 
the two kinds of compositions observed in hybrid systems: 

i) A hybrid system is a composition of synchronously executing hybrid automata; 

ii) Each hybrid automaton itself is a composition of synchronously executing, continuous 
dynamical systems. 

Thus, not all seed forms are saturated under all continuous dynamics in step 2. 

[13] Step 3 of the algorithm preferentially uses fault tolerant theorem proving. Sound 
procedures (even if incomplete) can be used to construct sound abstractions. In the case where all 
the seed forms are polynomials and the continuous dynamics are specified using polynomial 
expressions, then the theorem proving support required to automate the HybridAbstraction algorithm 
is the quantifier-free theory of reals, which is decidable. The HybridAbstraction algorithm can be 
applied on other, non-polynomial systems, and on more general seed forms. 
B. Biological systems as Hybrid models 

[14] The inventive method, herein referred to as HybridAbstraction, provides abstraction 
of a continuous or hybrid model of a biological system to a finite, discrete model. Subjecting the 
resulting model to discrete reasoning tools and decision procedures enables a desired output. One 
manner of performing HybridAbstraction includes application of the SAL toolkit (SAL: "Symbolic 
Analysis Laboratory", an SRI International developed language and toolset for describing and 
analyzing state transition systems). 

[15] Biological systems are described in terms of qualitative or hybrid (qualitative-quantitative) 
systems. For example, certain phenomena may be described in terms of discrete, non-analogue 
states: genes are switched on or off; enzymes are active or inactive; protein complex 
formation is favored or disfavored relative to environmental factors; introns are in or cut out of RNA 
gene products. 

[16] Describing these sorts of biological systems as qualitatively "on/off" fundamentally 
ignores the multiple modalities of the underlying physical processes, which are not 
simply "on" or "off". Practicing biologists are aware of the hybrid nature of biological systems. 

[17] Computational approaches to biological systems, however, employ strictly 
quantitative reasoning, where all phenomena are described with exact differential equations 
describing concentrations of various substances. 

[18] The novel HybridAbstraction approach taught herein reconciles the strictly 
quantitative approach of computational system biology with the qualitative and 
qualitative/quantitative reasoning brought to bear by the modern practicing biologist. The inventive 
approach further enables analysis of a system in which many parameters are unknown. 

[19] The HybridAbstraction approach includes: 

a) Describing a biological system in both qualitative and quantitative terms. For example, a gene 
may be "on" or "off", yet metabolic enzyme reactions may be best described by continuous 
differential equations; 

b) Formally describing the property of interest, using descriptions of the system and the property 
drawn from decidable theories, which enables completely automatic reasoning; 

c) Carving up the infinite state space into regions which are not differentiated by the property; 

d) Applying a discrete system analysis tool, including, but not limited to, bounded model checkers, 
symbolic model checkers, explicit-state model checkers, SAT solvers, term rewriting systems, 
Knuth-Bendix completion tools, and/or other system analysis tools such as infinite bounded model 
checking, arithmetic decision procedures, and decision procedures for combined theories; 

e) Concretizing the result of the application of the discrete system analysis tool to the biological 
system of interest; 

f) Repeating the above as needed. 

The result may further feed the analysis. For example, if a counterexample to the property is found, 
then the approach includes concretizing the counterexample trajectory back into the continuous 
domain, as well as providing output or display to the investigator. 

[20] HybridAbstraction can both suggest and verify properties of biological subsystems to 
enable modular reasoning about such subsystems. For example, biology includes many signaling 
motifs, and many instances of each motif. HybridAbstraction is useful in reasoning about a bistable 
switch motif, or an oscillator motif, and may enable reasoning about larger biological networks and 
systems. HybridAbstraction also enables reasoning about potential effects of perturbations to the 
biological system under investigation. One example of this is gene knock-outs or knock-ins, as well 
as environmental or drug interaction effects. 
C. Chemical Plant Control 

[21] HybridAbstraction also provides for modeling chemical processes in petrochemical 
and other industrial settings. More exact (and provably sound) methods for control of such systems 
can lead to more optimized performance and productivity. In particular, certain aspects of the 
chemical reactions underlying industrial processes are known (such as the balanced chemical 
equations involved), whereas other aspects may not be, such as the dependence of the rate of reaction on 
continuous variables such as pressure, temperature, and the partial pressures or concentrations of 
reactants, products, and various poisons. Seed polynomials can be derived from the known aspects 
of the system, and the property of interest (such as stability, or optimized generation of target 
products) formally stated. The present invention provides human designers a better understanding of 
petrochemical plant operations and status, as well as tools to diagnose anomalous behavior, thereby 
producing safer and more efficient methods for chemical plant control. 



D. Nanoscale Computer Architectures 

[22] It is possible to fabricate nanoscale electrical components just a few nanometers in 
their smallest dimension. For example, silicon nanowires and carbon nanotubes a few atoms (~1 nm) 
in diameter have been grown in laboratories. Also, single-molecule devices have been fabricated 
and shown to exhibit diode rectification, negative differential resistance, field-effect gating, and 
nonvolatile switchable behaviors. Unfortunately, all known nanoscale assembly techniques suffer 
from high degrees of uncertainty in placement: the individual components may be well 
characterized, but where the components end up is at least somewhat uncertain. Top-down assembly 
techniques such as photolithography do not provide the reliable placement of nanometer-scale 
devices in intentionally designed nanoscale patterns. 
D.1 Nanoscale Crossbar 

[23] In the particular case of a nanoscale crossbar, which could be used as a digital 
memory device, or as a PLA-style controller implementing digital logic, a large Cartesian grid of 
nanoscale wires is formed, perhaps 100 by 100 up to several thousand by several thousand. The 
vertical wires and horizontal wires may thus cross at between ten thousand and many millions of 
points. Each crosspoint, however, may only cover an area of several square nanometers (say, 2nm 
by 2nm). In this area may be placed nonvolatile single-molecule nanoscale switches, or the 
horizontal and/or vertical wires may be assembled with nanoscale FLASH-memory-like floating 
conductive regions. In any case, each crosspoint is capable of storing approximately one bit of 
information, where reading and writing that information is performed through precise control of the 
electrical potential on the horizontal and vertical wires. 

[24] Unfortunately, the complete lack of top-down nanoscale assembly techniques and the 
uncertainty of bottom-up (chemical) assembly techniques make it very difficult to predict what the 
exact electrical characteristics of each switch point might be. Exacerbating the problem is the 
number of switches in electrical contact with each nanowire: uncertainty can thus be compounded a 
thousandfold. 

[25] HybridAbstraction can be advantageously used to perform model-based discovery 
and configuration of nanoscale computer architectures assembled from nanoscale components. A 
nanoscale crossbar may be assembled with a high degree of uncertainty in molecular placement at 
the crosspoints; HybridAbstraction techniques can be used to refine a model of the crosspoint, and 
thus to enable the reliable use of the crosspoint to store information. For example, the threshold 
voltage and current which causes reliable switching of nonvolatile switches, and the threshold 
voltage and current which enables reliable reading of the stored information may not be consistent 
between crosspoints. HybridAbstraction may be used to discover a model of such a crossbar and to 
enable the use of that model to configure the nanoscale device. Many nanoscale assembly 
techniques exist to burn-in or permanently or semi-permanently adjust configuration bits. After a 
reliable model of a nanoscale memory device is constructed through HybridAbstraction, an 
automatic system may set these configuration bits, to maximize the utility of the device as a digital 
memory or controller. Thus HybridAbstraction is useful as a configuration tool, perhaps only once 
after the nanodevice is first manufactured, or perhaps periodically, as the device behavior changes 
due to random faults or environmental stresses. 
D.2 Nanocell 

[26] Taking seriously the lack of determined top-down assembly, researchers have 
developed the concept of a programmable nanocell, which welcomes the uncertainty of initial 
fabrication, and exploits the reconfigurability aspects of some nanoscale devices in order to construct 
some digital device of interest. In particular, the nanocell concept contemplates a rectangular well, 
bordered by between two and several hundred CMOS-lithographic-scale leads, containing tens to 
hundreds of nanoparticles, randomly bridged by connections often containing nonvolatile switchable 
molecules. Each nanocell can be viewed as a nanoscale plant, where the connections and the state of 
those connections must be modeled accurately enough to enable configuring internal connections 
into a state where the nanocell behaves as some target device of interest (e.g., an adder). 
HybridAbstraction can be used to create the necessary model, with seed forms chosen based on the 
electrical probes attached to the outside of the nanocell well. 

D.3 Nanoscale "Trim Bits" Help Push CMOS Limits 

[27] As "standard" photolithographic CMOS assembly techniques are pushed to smaller 
and smaller scales, uncertainty in device performance leads to suboptimal settings of power and 
clock frequency. HybridAbstraction can be used to determine "trim bits" which might be 
implemented with nanotechnology nonvolatile switches, placed very near CMOS devices of interest. 
For example, the tolerances of photolithographic manufacture of a CMOS transistor may not be 
sufficient to accurately control its threshold voltage, but bottom-up-assembled nanoscale switches 
sprayed on top of the entire CMOS circuit may be selectively switched to bring the CMOS 
transistors back within design tolerances. These "trim bits" must be set based on hybrid models of 
the underlying CMOS devices, and HybridAbstraction may be used to build such models. 

E. Molecular Modeling 

[28] HybridAbstraction can also be used to model the hybrid continuous and discrete 
behaviors of electrons in small molecules, and the physical reshaping of molecules such as protein, 
RNA, and DNA. 

E.1 Small Molecule Electrical Characteristics 

[29] The precise electrical characteristics of small molecules can be determined 
experimentally only at very great cost and effort using such techniques as break junctions (where a 
metal wire is drawn as thin as possible, and then intentionally broken, then a molecule of interest is 
allowed to assemble bridging that break) and electrically instrumented atomic force microscopes. 
The computer modeling of electrical properties of small molecules proceeds today with large 
numerical simulations. HybridAbstraction could be used to model the discrete and continuous 
aspects of such systems, and to predict properties of interest. 

[30] For example, molecules such as 2,5 diethynylphenyl-4-nitroaniline or 2,5- 
diethynylphenylnitrobenzene have been shown to exhibit very strong negative differential resistance, 
the unusual property where the current falls as the voltage is increased over some range. These devices also exhibit 
switchable states that are retained for long periods of time. These properties can be exploited to 
create nanoscale computer memories and logic devices. However, the computer modeling of such 
properties is very difficult and prone to error. Rare examples of qualitative confirmation of 
computer predictions are far outweighed by incorrect predictions. 

[31] HybridAbstraction can be used to model the differential equations governing the 
movement of electrons around a molecule, and associated physical conformation changes. Classic 
modeling of molecular systems uses finite element methods and is subject to numerical instability 
and imprecision. HybridAbstraction can be driven from a given abstract property of interest (e.g., 
electrical conductivity) and an underlying model of the governing differential equations, to build an 
abstract model of the molecule sufficient to address the property of interest. 
E.2 Protein Folding 

[32] HybridAbstraction techniques can also be applied to help understand the energetically 
favored physical conformation of large molecules, and the shape of the electrical potential around 
them. Proteins are assembled as sequences of amino acid residues based on mRNA templates. Once 
assembled, proteins fold into their energetically favored conformation. In some cases, proteins are 
folded further by other proteins, and/or subsequently modified post-translation, and/or activated or 
deactivated (e.g., by phosphorylation). The computational prediction of protein folding is a major 
challenge. 

[33] HybridAbstraction can be used to build models of large protein molecules, and of 
their external electrical potential. The differential equations governing the interaction between the 
amino acid residues, and between the folded protein and a given small molecule, can be abstracted 
into simplified form by HybridAbstraction to enable efficient computational predictions of physical 
properties of those proteins, the binding of proteins together, and the binding or docking of small 
molecules with proteins. For example, predicting the behavior of G-protein-coupled receptor 
proteins is very difficult, because they repeatedly traverse the cell membrane. HybridAbstraction can be 
used to build a simplified model of a lipid bilayer (perhaps as a 2-D fluid) and the surrounding aqueous 
envelope, from which aspects of the protein conformation can be predicted. Conformation change of 
G-protein-coupled receptors upon signaling molecule docking can be predicted in a similar fashion. 

[34] HybridAbstraction can be applied to other long polymers such as RNA and DNA. 
The folding of RNA and the excision of introns from RNA sequences can be addressed with 
HybridAbstraction. The binding of proteins to DNA, and the remodeling of chromatin-DNA 
complexes can also be modeled with HybridAbstraction techniques. 
F. Monitoring and Diagnosis 

[35] Abstract models constructed by HybridAbstraction can be used for a variety of 
applications. Firstly, they can be used to analyze the behavior of the original model. Since the 
abstract models are discrete and finite-state, formal verification approaches developed for analysis of 
discrete-state transition systems can be used for their analysis. Secondly, HybridAbstraction can also 
be used in monitoring and diagnosis of systems, as well as for model validation on such systems. 
HybridAbstraction can be used to perform model-based monitoring and diagnosis of complex 
systems that admit hybrid models. For this application, given a hybrid model of the plant and 
controller, HybridAbstraction is used to create an abstraction by picking the seed forms based on the 
sensors that are attached to the plant. This abstraction is computed offline. A monitoring and 
diagnosis system is then built by using the sensor readings generated by the system at runtime to make 
transitions on the abstract model. A fault is detected whenever the monitoring system finds an 
inconsistency between the actual sensor readings generated by the real system and the possible 
transitions on the abstraction. Diagnostic information can then be generated by retracing the path on 
the abstract system. 
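
The monitoring scheme described above may be sketched as follows. The thermostat-like abstract model, the thresholds, and the names `abstract_reading`, `run_monitor` and `TRANSITIONS` are hypothetical and chosen only for illustration. The sketch further assumes that sampling is fast enough that no abstract state change goes unobserved, and that a reading may remain in the same abstract state between samples.

```python
# Hypothetical abstract model computed offline: allowed abstract transitions.
TRANSITIONS = {
    "cold": {"ok"},
    "ok": {"cold", "hot"},
    "hot": {"ok"},
}

def abstract_reading(temp, low=18.0, high=22.0):
    """Map a concrete sensor reading to an abstract state (assumed thresholds)."""
    if temp < low:
        return "cold"
    if temp > high:
        return "hot"
    return "ok"

def run_monitor(transitions, readings):
    """Follow sensor readings on the abstract model.

    Returns (fault_index, trace): fault_index is the position of the first
    reading that is inconsistent with the abstract transitions, or None if
    every reading is consistent. The trace can be retraced for diagnosis.
    """
    trace = []
    for index, temp in enumerate(readings):
        state = abstract_reading(temp)
        if trace:
            previous = trace[-1]
            # Staying in the same abstract state is always allowed here.
            if state != previous and state not in transitions.get(previous, set()):
                return index, trace + [state]
        trace.append(state)
    return None, trace

print(run_monitor(TRANSITIONS, [17.0, 19.5, 23.1]))
print(run_monitor(TRANSITIONS, [17.0, 23.0]))
```

In the second call the reading jumps from "cold" to "hot" without passing through "ok", which the abstract model does not permit, so a fault is reported at that reading together with the trace leading to it.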

[36] HybridAbstraction can be used for model validation if the monitoring process 
described above is moved up in the design cycle for development of the system. More specifically, a 
monitor, based on an abstraction, is built for the plant model and the actual plant is monitored 
against this abstract model as described above. Any discrepancy between the actual plant 
and the abstract plant model points to inaccuracies in the model. Thus, the model of the plant can be 
refined and validated using this approach. The design of embedded software, such as controllers, for 
the plant, is then done over the validated plant model. The crucial aspect of HybridAbstraction as 
used in these two applications is the fact that the abstract models are much simpler compared to the 
original models and, hence, they can be executed along with the actual plant (or the actual plant and 
the controller) in real time. 

G. Controller Synthesis 

[37] HybridAbstraction provides an alternative technology for synthesizing controllers for 
hybrid plant models. The abstract discrete transition system generated by HybridAbstraction 
is amenable to both forward and backward search. Hence, controllers that guarantee desired 
behavior of the abstract plant model can be synthesized using the standard search based techniques. 
Controllers synthesized for the abstract models are safe for the actual plant model due to the 
soundness guarantees provided by the technique. 
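
One standard search-based technique that fits this description is a backward fixpoint computation for a safety objective, sketched below. The specification does not prescribe this particular algorithm; the abstract plant, its state and action names, and the function `synthesize_safety_controller` are hypothetical. Each action maps to a set of possible successors because the abstraction is nondeterministic.

```python
def synthesize_safety_controller(transitions, safe_states):
    """Compute states from which some action keeps the plant safe forever.

    transitions -- {state: {action: set of possible successor states}}
    Returns (winning, controller), where controller maps each winning state
    to the sorted list of actions whose successors all stay winning.
    """
    winning = set(safe_states)
    changed = True
    while changed:
        changed = False
        for state in list(winning):
            good = [a for a, succs in transitions.get(state, {}).items()
                    if succs and succs <= winning]
            if not good:
                # No action can guarantee staying inside the winning set.
                winning.discard(state)
                changed = True
    controller = {
        state: sorted(a for a, succs in transitions[state].items()
                      if succs and succs <= winning)
        for state in winning
    }
    return winning, controller

# Hypothetical abstract heater model with actions "on" and "off".
PLANT = {
    "cold": {"on": {"cold", "ok"}, "off": {"cold"}},
    "ok": {"on": {"ok", "hot"}, "off": {"ok", "cold"}},
    "hot": {"on": {"hot", "overheat"}, "off": {"hot", "ok"}},
    "overheat": {"on": {"overheat"}, "off": {"overheat", "hot"}},
}

winning, controller = synthesize_safety_controller(PLANT, {"cold", "ok", "hot"})
print(controller["hot"])
```

Here the controller must switch the heater off in state "hot", because the "on" action may lead to "overheat". Since every concrete behavior is covered by the abstraction, an action that is safe on all abstract successors is safe for the plant model.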

H. Hybrid Automata 

[38] Hybrid systems are modeled as a composition of finitely many hybrid automata. 
Each hybrid automaton is represented in one of two ways: 

Standard hybrid automata: A hybrid automaton is specified as a finite state automaton with 
continuous dynamics given inside each state (using differential equations). 

Inside-out hybrid automata: A hybrid automaton is specified using a single specification of 
the continuous dynamics and complex (if-then-else) expressions for specifying mode changes (that 
is, updates to continuous and discrete variables). 

[39] A combination of these two representations can also be used for representation of 
hybrid systems. All these styles of specification of hybrid automata and hybrid systems are 
inter-translatable, but the translated automata are generally too large to be amenable to analysis. The 
HybridAbstraction technique can construct abstractions directly from these representations, avoiding 
the translation problems. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Fig 1 is illustrative of the use of HybridAbstraction within an analysis framework. 
Fig 2 depicts an abstract thermostat system produced according to the invention. 
Fig 3 depicts results of HybridAbstraction applied to a biological system. 
Fig 4 relates to biological system modeling. 
Fig 5 relates to biological system modeling. 



DETAILED DESCRIPTION OF THE INVENTION 

Introduction 

[40] Hybrid systems describe a wide class of systems that exhibit both discrete and 
continuous behaviors, such as a digital system embedded in an analog environment. Since hybrid 
systems frequently operate in safety-critical domains, for example, inside automobiles, aircraft, and 
chemical plants, analysis techniques are needed to support the design process of embedded software 
for controlling hybrid systems. 

[41] The development of tools and analysis techniques for hybrid systems faces two 
challenges. It has been shown that checking reachability for even a very simple class of hybrid systems is 
undecidable. Several decidable classes have been identified, but all of these classes are too weak to 
represent hybrid system models that arise in practical applications. In fact, the models of the physical 
environment in real world scenarios are usually too large and complicated even for analysis tools 
built on semi-decision procedures and other currently available technologies. 

[42] Abstraction is a technique to reduce the complexity of a system design, while 
preserving some of its relevant behavior, so that the simplified system is more accessible to analysis 
tools and is still sufficient to establish certain safety properties (those properties that are true of all 
states). Two powerful abstraction techniques, called predicate abstraction and data abstraction 
respectively, have been used quite successfully in analyzing discrete transition systems. The 
invention includes a very simple, yet quite effective, technique based on data abstraction, to 
construct a series of successively finer abstractions of a given hybrid system. 

[43] Hybrid automata are mathematical models for representing hybrid systems. In 
contrast to discrete transition systems, hybrid automata can make both discrete and continuous 
transitions and hence, their semantics are given in terms of the states, which are uncountably many, 
reached over a continuous real time interval. However, the theory of hybrid automata can be given in 
terms of infinite-state transition systems that contain uncountably many states, but are interpreted 
over discrete time steps. In HybridAbstraction, the uncountable state space is mapped into a finite 
state space by the inventive abstraction function. More specifically, the n-dimensional real space $\mathbb{R}^n$ 
is partitioned into zones that are sign-invariant for all polynomials in some finite set. Increasing the 
number of polynomials in the set results in finer abstractions. 
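
The partition into sign-invariant zones can be illustrated with a small sketch. The forms are given as plain Python callables and the names `sign` and `abstract_state` are hypothetical; exact arithmetic is assumed for the inputs so that the zero case is meaningful.

```python
def sign(value):
    """Three-valued sign of a number."""
    if value > 0:
        return "pos"
    if value < 0:
        return "neg"
    return "zero"

def abstract_state(forms, state):
    """Map a concrete state to the tuple of signs of the given forms."""
    return tuple(sign(p(*state)) for p in forms)

# Hypothetical forms over two state variables x and y.
coarse = [lambda x, y: x, lambda x, y: y]
fine = coarse + [lambda x, y: x * x + y * y - 1]

a, b = (0.5, 0.5), (2, 2)
print(abstract_state(coarse, a), abstract_state(coarse, b))
print(abstract_state(fine, a), abstract_state(fine, b))
```

Under the coarse set of forms both states fall into the same zone. Adding the polynomial x^2 + y^2 - 1 separates them, which is the sense in which a larger set of polynomials yields a finer abstraction.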

[44] Significantly, HybridAbstraction extends known qualitative techniques in at least two 
ways. First, the evolution of arbitrary polynomials (over the state variables) is tracked, and not just 
the state variables, while constructing an abstraction. Second, whereas qualitative reasoning usually 
uses the sign of only the first derivative, the invention uses the signs of the first n derivatives. These 
two non-trivial extensions make HybridAbstraction substantially more powerful. 

[45] HybridAbstraction as illustrated in Figure 1 provides for the construction of a series 
of finer abstractions, enabling an iterative methodology to prove a safety property. Starting with a 
description of a hybrid system 20 and a property of interest 10, HybridAbstraction 30 creates a first 
(relatively crude) abstraction 40 (also known as a discrete approximation). A formal methods tool 
50 then checks whether the property of interest 10 holds for this abstract system 40. This is 
represented as a true or false result 60, optionally including a counterexample for falsehoods and a 
proof for truths. If the property of interest 10 is false, the invention provides for the creation of a 
finer abstraction by an iterative analysis 70 and for checking the property again by formal methods 
tool 50. This iterative analysis 70 can be repeated until the property is either established or no further 
refinements of the system can be constructed. Because the resulting abstractions 40 are discrete 
transition systems, techniques such as model checking can be used as formal methods tool 50. 
Furthermore, HybridAbstraction is different from many other approaches to hybrid system analysis 
in that it does not use any numerical methods and techniques. 
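
The iterative methodology of Figure 1 may be sketched as follows, with a plain breadth-first reachability search standing in for formal methods tool 50. The abstractions in the example are hand-written and hypothetical; in practice each would be produced by HybridAbstraction 30 from a larger set of forms. The names `check_safety` and `prove_by_refinement` are likewise assumptions made for the sketch.

```python
from collections import deque

def check_safety(init, transitions, is_safe):
    """Explore the finite abstract system breadth-first.

    Returns (True, None) if every reachable state is safe, otherwise
    (False, path) where path is a counterexample from an initial state.
    """
    parent = {s: None for s in init}
    queue = deque(init)
    while queue:
        state = queue.popleft()
        if not is_safe(state):
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return False, path[::-1]
        for succ in transitions.get(state, ()):
            if succ not in parent:
                parent[succ] = state
                queue.append(succ)
    return True, None

def prove_by_refinement(abstractions, is_safe):
    """Try successively finer abstractions; return the first level that proves safety."""
    for level, (init, transitions) in enumerate(abstractions):
        holds, _counterexample = check_safety(init, transitions, is_safe)
        if holds:
            return level
    return None  # no available refinement establishes the property

# Hypothetical crude and refined abstractions of the same system.
crude = (["s0"], {"s0": ["s1"], "s1": ["bad"]})
refined = (["s0"], {"s0": ["s1"], "s1": ["s0"]})
is_safe = lambda s: s != "bad"

print(check_safety(*crude, is_safe))
print(prove_by_refinement([crude, refined], is_safe))
```

A counterexample found on a crude abstraction may be spurious, since the abstraction over-approximates the behavior of the hybrid system; that is why a false result triggers refinement rather than an immediate verdict.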

[46] The process of construction of the abstract system requires logical reasoning in the 
theory of reals. The first-order theory of real closed fields is known to be decidable. In 
HybridAbstraction, the first-order theory of reals is used to represent sets of continuous states; 
HybridAbstraction then uses reasoning over the first order theory of reals for creating abstract 
transition systems. 

Preliminaries 

[47] The signature of the first-order theory of reals consists of function symbols $\{+, -, \cdot\}$, 
constants $\mathbb{R}$, and predicate symbols $\{=, >, \geq, <, \leq\}$. In this theory, the set of terms over a set $X$ of 
variables corresponds to the set of polynomials $\mathbb{R}[X]$. The set $ATM(X)$, defined as 
$\{p \sim 0 : p \in \mathbb{R}[X] \text{ and } {\sim} \in \{=, >, \geq, <, \leq\}\}$, is the set of all atomic formulas. The set $WFF(X)$ of first-order formulas 
(over $X$) is defined as the smallest set containing $ATM(X)$ and closed under the boolean operations 
(conjunction $\wedge$, disjunction $\vee$, implication $\Rightarrow$, and negation $\neg$) and quantification (existential $\exists$ 
and universal $\forall$). The first-order theory of reals, also denoted by $\mathbb{R}$, is defined as the set of all first-order 
formulas over the above signature (and a countable set of variables) that are true over the real 
numbers. The notation $\mathbb{R} \models \varphi$ is used to denote the fact that the (first-order) formula $\varphi$ is true in 
the theory of reals. The first-order theory of the real closed fields is a complete theory, i.e., every 
sentence in $WFF(X)$ is either true or its negation is true in this theory, and the theory is known to be 
decidable. 

[48] Formulas in $WFF(X)$ are denoted by $\varphi$, $\psi$, possibly with subscripts, and $p$ is used to 
denote polynomials in the set $\mathbb{R}[X]$. A polynomial $p$ occurs in a formula $\varphi$ if there is an atomic 
formula $p \sim 0$ in $\varphi$. The rest of the notation follows the standard practice in hybrid systems 
literature. 

Continuous Dynamical Systems 

[49] For simplicity, hybrid systems with no discrete components are considered in this 
section, that is, hybrid systems with exactly one mode of operation. A continuous dynamical system 
$CS$ is a tuple $(X, Init, Inv, f)$ where $X$ is a finite set of variables interpreted over the reals $\mathbb{R}$, $\mathbf{X} = \mathbb{R}^X$ is 
the set of all valuations of the variables $X$, $Init \subseteq \mathbf{X}$ is the set of initial states, $Inv \subseteq \mathbf{X}$ is the 
invariant set of states, and $f : \mathbf{X} \to T\mathbf{X}$ is a vector field that specifies the continuous dynamics. Here 
$T\mathbf{X}$ denotes the tangent space of $\mathbf{X}$. Assume that $f$ satisfies the standard assumptions for existence 
and uniqueness of solutions to ordinary differential equations. The continuous dynamical systems 
considered here are autonomous: they have no inputs. 

[50] The semantics, $[\![CS]\!]$, of a continuous dynamical system $CS = (X, Init, Inv, f)$ over an 
interval $I = [\tau_0, \tau_1] \subseteq \mathbb{R}$ is a collection of mappings $\sigma : I \to \mathbf{X}$ satisfying 

(a) initial condition: $\sigma(\tau_0) \in Init$, 

(b) continuous evolution: for all $\tau \in (\tau_0, \tau_1)$, $\dot{\sigma}(\tau) = f(\sigma(\tau))$, and 

(c) invariant: for all $\tau \in [\tau_0, \tau_1]$, $\sigma(\tau) \in Inv$. 

In case the interval $I$ is left unspecified, it is assumed to be the interval $[0, \infty)$. 

[51] Assume that the flow derivative, $f$, is specified using polynomial expressions over the 
state variables $X$, that is, $f \in (\mathbb{R}[X])^{|X|}$, where $\mathbb{R}[X]$ denotes the set of polynomials over the 
indeterminates $X$ with coefficients in $\mathbb{R}$, and $|X|$ denotes the cardinality of $X$. These polynomials can 
be nonlinear in general. 

[52] Example 1. The actuator module in a simple electronic throttle control system is 
driven by a pulse- width modulated signal and can be described as a hybrid system with two modes: 
when the input signal is high, the system is in the "on" mode and is described by 

K = 2222( 24 -2K-/) 7 = ^(120-227) 

[53] and when the input is low, the system is in the "off" mode and is described by 

^-2000 /= 2000( ) 
3 15 V } 

In each mode, therefore, the actuator behaves as a continuous dynamical system with two continuous 
variables $V$ and $I$. 

Discrete Transition Systems 

[54] A discrete state transition system $DS$ is a tuple $(Q, Init, t)$ where $Q$ is a finite set of 
variables interpreted over countable domains, $\mathbf{Q}$ denotes the (countable) set of all valuations of the 
variables $Q$ over the respective domains, $Init \subseteq \mathbf{Q}$ is a set of initial states, and $t \subseteq \mathbf{Q} \times \mathbf{Q}$ is a set of 
transitions. The semantics, $[\![DS]\!]$, of a discrete state transition system $DS = (Q, Init, t)$ is the 
collection of all mappings $\theta : \mathbb{N} \to \mathbf{Q}$ satisfying 

(a) initial condition: $\theta(0) \in Init$, and 

(b) discrete evolution: for all $i \in \mathbb{N}$, $(\theta(i), \theta(i + 1)) \in t$. 

In order to define a notion of abstraction precisely, it is necessary to establish a 
correspondence between discrete evolutions $\theta : \mathbb{N} \to \mathbf{Q}$ and continuous evolutions $\sigma : [0, \infty) \to \mathbf{Q}$. 
This is done using discrete sampling. 

[55] Definition 1. A discrete evolution $\theta : \mathbb{N} \to \mathbf{Q}$ is a sufficiently complete 
discretization of a continuous evolution $\sigma : [0, \infty) \to \mathbf{Q}$ if there exists a strictly increasing sequence 
$(\tau_0, \tau_1, \tau_2, \ldots)$ of reals in the interval $[0, \infty)$ such that 

(i) $\tau_0 = 0$, 

(ii) the function $\sigma$ does not change on the domain $(\tau_i, \tau_{i+1})$, that is, $\sigma(\tau) = \sigma(\tau')$ for all 
$\tau, \tau' \in (\tau_i, \tau_{i+1})$, and 

(iii) for all i, $\theta(2i) = \sigma(\tau_i)$ and $\theta(2i+1) = \sigma(\tau)$, where $\tau_i < \tau < \tau_{i+1}$. 

[56] Intuitively, a sufficiently complete discretization captures all the "different" (abstract) 
states in the given continuous evolution. 
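
As an informal illustration of Definition 1, the following Python sketch builds a finite prefix of such a discretization from a list of breakpoints. The helper names `sign` and `discretize`, the sample evolution (the sign of x(t) = 1 - t), and the breakpoints are assumptions chosen for illustration only; they are not part of the described method.

```python
from typing import Callable, List, Sequence


def sign(v: float) -> str:
    """Map a real value to the abstract domain {pos, neg, zero}."""
    return "pos" if v > 0 else "neg" if v < 0 else "zero"


def discretize(sigma: Callable[[float], str], taus: Sequence[float]) -> List[str]:
    """Finite prefix of a discretization theta built from breakpoints taus.

    theta(2i) samples sigma at tau_i, and theta(2i+1) samples it at a point
    strictly between tau_i and tau_(i+1) (here, the midpoint). The caller
    must choose taus so that sigma is constant on every open interval
    (tau_i, tau_(i+1)); this sketch does not check that condition.
    """
    theta = []
    for lo, hi in zip(taus, taus[1:]):
        theta.append(sigma(lo))
        theta.append(sigma((lo + hi) / 2.0))
    return theta


# Hypothetical abstract evolution: the sign of x(t) = 1 - t.
def abstract_sigma(t: float) -> str:
    return sign(1.0 - t)


theta = discretize(abstract_sigma, [0.0, 1.0, 2.0])
print(theta)  # ['pos', 'pos', 'zero', 'neg']
```

The point sample at t = 1 is what captures the instantaneous state "zero", which sampling only inside the open intervals would miss.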

[57] Definition 2. Let CS = (X, InitX, Inv, f) be a continuous dynamical system and DS = 
(Q, InitQ, t) be a discrete transition system. DS is an abstraction for CS if there exists a mapping 
$abs : \mathbf{X} \to \mathbf{Q}$ such that 

(a) $abs(InitX) \subseteq InitQ$, and 

(b) for every $\sigma \in$ [[CS]], if $\sigma'$ is a sufficiently complete discretization of $abs(\sigma)$, then 
$\sigma' \in$ [[DS]]. 

[58] Here, abs is used to also denote liftings of the function abs to sets and functions. 
Thus, $abs(InitX) = \{abs(x) : x \in InitX\}$. Similarly, if $\sigma : [0, \infty) \to \mathbf{X}$, then $(abs(\sigma))(\tau) = abs(\sigma(\tau))$. 
This definition of abstraction corresponds to the usual sense of abstraction, but is applied here to the 
infinite state transition system associated with a continuous (hybrid) system. The problem of 
constructing discrete transition system abstractions for continuous dynamical systems in the sense of 
Definition 2 is considered next. The definition and the procedure for constructing an abstraction 
extend naturally to hybrid systems; see the Hybrid Automata discussion that follows herein. 
Abstracting Continuous Dynamical Systems 

[59] Data abstraction refers to the idea of using a partition of the domain of interpretation 
of a discrete system as the new domain of interpretation (in the abstracted system) for the state 
variables, or expressions over the state variables. The invention includes performing data abstraction 
on continuous and hybrid systems. Abstract variables are used that represent polynomials over the 
original continuous variables X, and the abstract variables are interpreted over a three-valued abstract 
domain {neg, pos, zero}. 

[60] Given a continuous dynamical system CS = (X, InitX, Inv, f), the invention constructs 
the abstract discrete state transition system DS = (Q, InitQ, t) in two phases. The first phase creates a 
finite set $P \subseteq \mathbb{R}[X]$ of polynomials over the continuous variables X, which are used as the discrete 
variables Q. In the second phase, the initial states InitQ and the transition relation t are computed. 
Phase I: Obtaining a set of polynomials 

[61] Fixing the set P of polynomials for abstraction involves starting with a small set $P_0$ of 
polynomials of interest (the "seed forms") and adding to this set the time derivatives of polynomials 
in $P_0$. The initial set $P_0$ could contain, for example, the polynomials that appear in the statement of 
the property of interest one wants to establish for the given continuous system, or the polynomials 
that occur in the guards of mode change transitions. The phase I saturation process involves 
application of the following inference rule: if $p \in P$, then add $\dot{p}$, the derivative (with respect to 
time) of p, to the set P unless $\dot{p}$ is a constant or a constant factor multiple of some existing 
polynomial in P. 

For hybrid systems with linear dynamics, the linear polynomials whose coefficients form a left 
eigenvector of the A matrix of the linear system should preferably be chosen as seed polynomials. In the 
case of nonlinear systems, suitable seed polynomials are generated using techniques from algebraic 
geometry. Lyapunov functions are also good choices of seed polynomials. 

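[62] Because it is assumed that $f \in (\mathbb{R}[X])^{|X|}$, it follows that $\dot{p} \in \mathbb{R}[X]$ is a polynomial: 
by the chain rule, $\dot{p} = \sum_{x \in X} \frac{\partial p}{\partial x} f_x$, where $f_x$ is the component of f for the variable x. 
However, note that for general flow derivatives f specified using arbitrary polynomial expressions, 
the saturation process might not terminate. But there are special cases where this process is 
guaranteed to terminate. 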

[63] Nilpotent Systems. Consider the class of linear time invariant systems specified 
using a nilpotent matrix A. If X is also used to denote the column vector of state variables, then 
the flow rate is f = AX and hence $\dot{X} = AX$. A linear polynomial $p = \sum_i a_i x_i$ can be written, in matrix 
notation, as $a^T X$, where $a^T$ denotes the transpose of a. Thus, $\dot{p} = a^T \dot{X} = a^T A X$ and $\ddot{p} = a^T A^2 X$. 
Hence, if $A^n = 0$, then the n-th derivative of the polynomial p is $a^T A^n X = 0$. Thus, the saturation 
process is guaranteed to terminate for such systems. 
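
The termination argument for nilpotent systems can be checked on a small case. The following Python sketch is only illustrative: the 3x3 matrix `A`, the seed vector, and the helper names `row_times_matrix` and `derivative_chain` are assumptions, not taken from the description. It tracks the coefficient vectors $a^T, a^T A, a^T A^2, \ldots$ of successive derivatives until the zero vector appears.

```python
def row_times_matrix(a, A):
    """Coefficient vector of the derivative of p = a^T X under X' = A X, i.e. a^T A."""
    n = len(a)
    return [sum(a[i] * A[i][j] for i in range(n)) for j in range(n)]


def derivative_chain(a, A, max_steps=10):
    """Coefficient vectors of p, p', p'', ... until the zero vector (or max_steps)."""
    chain = [list(a)]
    for _ in range(max_steps):
        nxt = row_times_matrix(chain[-1], A)
        chain.append(nxt)
        if all(c == 0 for c in nxt):
            break
    return chain


# Hypothetical nilpotent system: x1' = x2, x2' = x3, x3' = 0, so A^3 = 0.
A = [[0, 1, 0],
     [0, 0, 1],
     [0, 0, 0]]

chain = derivative_chain([1, 0, 0], A)  # seed polynomial p = x1
print(chain)  # [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0]]
```

Here the third derivative of p = x1 vanishes, so saturation adds at most x2 and x3 before stopping.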

[64] Systems such that $A^n = rA^m$. If the matrix A used to specify the flow of the 
continuous dynamical system CS is such that $A^n = rA^m$ for some constant $r \in \mathbb{R}$ and $n, m \in \mathbb{N}$, then 
again, the saturation process can be shown to terminate. In particular, if $p = a^T X$ is an arbitrary 
linear polynomial, then 

$\frac{d^n p}{dt^n} = a^T A^n X = a^T r A^m X = r \frac{d^m p}{dt^m}$. 

Because the n-th derivative of p is a constant multiple of the m-th derivative of p, it does not get 
added to the set P of polynomials in the saturation process. The termination of the saturation process 
is determined by both the initial set $P_0$ of polynomials and the flow derivative f. 

[65] General Case. The inventive abstraction technique works for general (possibly 
nonlinear) time invariant systems whose flow is specified using polynomials. The termination of the 
saturation phase is not necessary for creating an abstraction. It is possible to stop at any point and 
pass on the current set P to the second phase. A larger set P yields a finer abstraction as it results in a 
larger state space in the final abstract system. 

[66] Example 2. Consider the "off" mode of the actuator in Example 1. Starting with the 
set $P_0 = \{V, I\}$ of polynomials, the phase I saturation procedure first adds the polynomial 
$\dot{I} = \frac{2000}{15}(5V - 16I)$ and then the derivative of this polynomial, $\ddot{I} = \frac{2000^2}{15}\left(-\frac{16V}{3} + \frac{77I}{5}\right)$. Because 
the derivative $\dot{V}$ is a constant multiple of the polynomial $I \in P$, it is not added. Note that the exact 
derivatives do not need to be added, but only a polynomial up to some constant factor. Although the 
process of adding derivatives can be continued, this example stops phase I here with the final set 
$P = \{V, I, 5V - 16I, -80V + 231I\}$. 
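
The saturation in Example 2 can be reproduced mechanically. The sketch below is restricted to linear polynomials over linear dynamics $\dot{X} = AX$, which is an assumption made to keep it short; the function names and the `max_size` cut-off are likewise illustrative. The matrix `A` encodes the off-mode flow with $\dot{V} = -\frac{2000}{3} I$ and $\dot{I} = \frac{2000}{15}(5V - 16I)$, the values consistent with the derivatives stated in Example 2.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def lie_derivative(a, A):
    """Coefficients of d/dt (a^T X) under X' = A X, i.e. the row vector a^T A."""
    n = len(a)
    return tuple(sum(a[i] * A[i][j] for i in range(n)) for j in range(n))


def proportional(a, b):
    """True if a and b are constant multiples of each other (all 2x2 minors vanish)."""
    n = len(a)
    return all(a[i] * b[j] == a[j] * b[i] for i in range(n) for j in range(i + 1, n))


def primitive(a):
    """Rescale by a positive constant to coprime integer coefficients."""
    lcm = reduce(lambda x, y: x * y // gcd(x, y), (Fraction(c).denominator for c in a), 1)
    ints = [int(Fraction(c) * lcm) for c in a]
    g = reduce(gcd, (abs(v) for v in ints), 0)
    return tuple(v // g for v in ints) if g else tuple(ints)


def saturate(seeds, A, max_size):
    """Phase I sketch: add derivatives until nothing new appears or max_size is reached."""
    P = [primitive(s) for s in seeds]
    i = 0
    while i < len(P) and len(P) < max_size:
        d = lie_derivative(P[i], A)
        # Skip the zero derivative and constant multiples of polynomials already in P.
        if any(d) and not any(proportional(d, q) for q in P):
            P.append(primitive(d))
        i += 1
    return P


# Off mode, variables ordered (V, I); row i holds the coefficients of the i-th flow.
A = [[Fraction(0), Fraction(-2000, 3)],
     [Fraction(2000, 3), Fraction(-6400, 3)]]

P = saturate([(1, 0), (0, 1)], A, max_size=4)
print(P)  # [(1, 0), (0, 1), (5, -16), (-80, 231)]
```

The coefficient pairs correspond to V, I, 5V - 16I and -80V + 231I. Raising `max_size` continues the process past the point where the example chooses to stop.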
Phase II: Constructing the Abstract Transitions 

[67] Let CS = (X, InitX, Inv, f) be a continuous system and $P \subseteq \mathbb{R}[X]$ be a finite set of 
polynomials over the set X of variables produced by the first phase. The set Q of state variables in the 
corresponding abstract discrete system DS = (Q, InitQ, t) contains exactly one new variable for each 
polynomial $p \in P$. Thus, $Q = \{q_p : p \in P\}$. These new variables are interpreted over the domain 
{pos, neg, zero} and consequently the set $\mathbf{Q}$ of all discrete states is the set $\{pos, neg, zero\}^Q$ of all 
valuations of the variables Q over this domain. Any such valuation is represented by the 
corresponding conjunction of atomic formulas. For example, the valuation 
$(q_{p_1} \mapsto pos,\ q_{p_2} \mapsto neg,\ q_{p_3} \mapsto zero)$ will be thought of as the formula $p_1 > 0 \wedge p_2 < 0 \wedge p_3 = 0$. 
Such conjunctions and valuations are used here interchangeably. The set of all conjunctions representing 
such valuations will also be denoted by $\mathbf{Q}$. Note that these conjunctions are in the set WFF(X) of formulas 
over free variables X. 

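[68] If $\psi \in \mathbf{Q}$ is a state in the abstract system DS, represented, for example, as 

$\psi = \bigwedge_{i \in J_1} p_i > 0 \;\wedge\; \bigwedge_{i \in J_2} p_i < 0 \;\wedge\; \bigwedge_{i \in J_3} p_i = 0$, 

then the concretization function, $\gamma$, maps abstract states to sets of concrete states and is defined by 

$\gamma(\psi) = \{x \in \mathbf{X} : \mathbb{R} \models p_i(x) > 0\ \forall i \in J_1 \text{ and } \mathbb{R} \models p_i(x) < 0\ \forall i \in J_2 \text{ and } \mathbb{R} \models p_i(x) = 0\ \forall i \in J_3\}$. 

Here, the notation $\mathbb{R} \models p(x) > 0$ means that the polynomial p evaluates to a positive number on the 
point $x \in \mathbb{R}^{|X|}$. 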

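[69] Conversely, if $x \in \mathbf{X}$ is a concrete state of the system CS, then the abstraction 
function, abs, maps a concrete state to an abstract state and is defined by 

$abs(x) = \bigwedge_{i \in J_1} p_i > 0 \;\wedge\; \bigwedge_{i \in J_2} p_i = 0 \;\wedge\; \bigwedge_{i \in J_3} p_i < 0$ 

[70] where $J_1 \cup J_2 \cup J_3$ is a partition of the set $\{1, 2, \ldots, |P|\}$ such that $i \in J_1$ iff $\mathbb{R} \models p_i(x) > 0$, 
$i \in J_2$ iff $\mathbb{R} \models p_i(x) = 0$, and $i \in J_3$ iff $\mathbb{R} \models p_i(x) < 0$. 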
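
The abstraction function amounts to evaluating each polynomial at the concrete state and recording its sign. A minimal Python sketch follows, using the four polynomials of Example 2; the names `sign` and `abs_state` and the sample points are assumptions for illustration.

```python
def sign(v):
    """Map a real value to the abstract domain {pos, neg, zero}."""
    return "pos" if v > 0 else "neg" if v < 0 else "zero"


def abs_state(polys, x):
    """Abstract state of the concrete state x: one sign per polynomial in polys."""
    return tuple(sign(p(*x)) for p in polys)


# The set P from Example 2, as functions of the concrete state (V, I).
P = [
    lambda V, I: V,
    lambda V, I: I,
    lambda V, I: 5 * V - 16 * I,
    lambda V, I: -80 * V + 231 * I,
]

print(abs_state(P, (1, 1)))  # ('pos', 'pos', 'neg', 'pos')
print(abs_state(P, (0, 0)))  # ('zero', 'zero', 'zero', 'zero')
```

Many concrete states share one abstract state; for instance (1, 1) and (2, 2) yield the same sign tuple.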

[71] The Initial States. Assume that the initial set of states InitX for the continuous system 
is specified using a first-order formula $\varphi_X$ over X. The initial set of states InitQ consists of all 
valuations $\psi$ of the abstract variables such that the formulas $\psi$ and $\varphi_X$ are simultaneously satisfiable. 
Specifically, 

$InitQ = \bigvee \{\psi \in \mathbf{Q} : \mathbb{R} \models \exists X : \psi \wedge \varphi_X\}$. 

[72] Lemma 1. Let CS = (X, InitX, Inv, f) be a continuous system with the initial states 
InitX specified by the first-order formula $\varphi_X$. If DS, InitQ, and abs are as defined above, then 

$abs(InitX) \subseteq InitQ$. 

[73] A formula and the set of valuations it represents are used interchangeably as the 
context disambiguates the intended meaning. 

[74] The Transition Relation. An abstract transition $(\psi_1, \psi_2) \in t$ is added if all of the 
following conditions hold (for all polynomials $p \in P$): 

(a) if $p < 0$ is a conjunct in $\psi_1$, then 

(a1) if $\mathbb{R} \models \psi_1 \Rightarrow \dot{p} < 0$, then $p < 0$ is a conjunct in $\psi_2$; 
(a2) if $\mathbb{R} \models \psi_1 \Rightarrow \dot{p} = 0$, then $p < 0$ is a conjunct in $\psi_2$; 
(a3) if $\mathbb{R} \models \psi_1 \Rightarrow \dot{p} > 0$, then either $p < 0$ or $p = 0$ is a conjunct in $\psi_2$; and 
(a4) if the valuation of $\dot{p}$ cannot be deduced from $\psi_1$, then either $p < 0$ or $p = 0$ is a conjunct 
in $\psi_2$; 

(b) if $p = 0$ is a conjunct in $\psi_1$, then 

(b1) if $\mathbb{R} \models \psi_1 \Rightarrow \dot{p} < 0$, then $p < 0$ is a conjunct in $\psi_2$; 
(b2) if $\mathbb{R} \models \psi_1 \Rightarrow \dot{p} = 0$, then $p = 0$ is a conjunct in $\psi_2$; 
(b3) if $\mathbb{R} \models \psi_1 \Rightarrow \dot{p} > 0$, then $p > 0$ is a conjunct in $\psi_2$; and 
(b4) if the valuation of $\dot{p}$ cannot be deduced from $\psi_1$, then either $p > 0$, $p = 0$, or $p < 0$ is a 
conjunct in $\psi_2$; 

(c) if $p > 0$ is a conjunct in $\psi_1$, then 

(c1) if $\mathbb{R} \models \psi_1 \Rightarrow \dot{p} < 0$, then either $p > 0$ or $p = 0$ is a conjunct in $\psi_2$; 
(c2) if $\mathbb{R} \models \psi_1 \Rightarrow \dot{p} = 0$, then $p > 0$ is a conjunct in $\psi_2$; 
(c3) if $\mathbb{R} \models \psi_1 \Rightarrow \dot{p} > 0$, then $p > 0$ is a conjunct in $\psi_2$; and 
(c4) if the valuation of $\dot{p}$ cannot be deduced from $\psi_1$, then either $p > 0$ or $p = 0$ is a conjunct 
in $\psi_2$. 

[75] This completes the phase of adding transitions to the abstract system. Note that the 
sign of $\dot{p}$ can be directly read off from $\psi_1$ if $\dot{p}$ was added to P in phase I. If not, then 
nondeterministic transitions from $\psi_1$ are added assuming all possibilities for the sign of $\dot{p}$. In the final 
step, this abstract system is refined to eliminate unreachable states and transitions. 
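
The case analysis (a) through (c) can be condensed into a small lookup. The Python sketch below is a simplified reading of those rules, with `None` standing for a derivative sign that cannot be deduced; the function names and the sample state are assumptions. It enumerates candidate successors only and does not perform the feasibility check, which requires a decision procedure for the reals.

```python
from itertools import product


def next_signs(sign_p, sign_dp):
    """Signs of p allowed in a successor state, given the current sign of p and
    the sign of its time derivative (None when the derivative sign is unknown)."""
    if sign_p == "zero":
        exact = {"neg": {"neg"}, "zero": {"zero"}, "pos": {"pos"}}
        return exact.get(sign_dp, {"neg", "zero", "pos"})
    if sign_p == "pos":
        return {"pos"} if sign_dp in ("pos", "zero") else {"pos", "zero"}
    # sign_p == "neg"
    return {"neg"} if sign_dp in ("neg", "zero") else {"neg", "zero"}


def candidate_successors(state):
    """All sign tuples allowed by the rules; state is a list of (sign_p, sign_dp) pairs."""
    options = [sorted(next_signs(s, d)) for s, d in state]
    return list(product(*options))


# Hypothetical abstract state over four polynomials:
# two positive and decreasing, one negative and increasing, one positive with unknown trend.
state = [("pos", "neg"), ("pos", "neg"), ("neg", "pos"), ("pos", None)]
successors = candidate_successors(state)
print(len(successors))  # 16
```

Each polynomial in this sample state has two admissible successor signs, giving 2^4 = 16 candidates; a state whose derivative signs all agree with its current signs has itself as its only candidate.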

[76] Refining the Abstraction. Certain abstract states (and transitions to/from those 
states) can be deleted because either they are infeasible or are explicitly disallowed by the given 
invariant set Inv of the concrete system. In particular, if the invariant set Inv is specified using a 
first-order formula $\varphi_{Inv}$, then one may delete all abstract states $\psi$ such that $\mathbb{R} \not\models \exists X : \psi(X) \wedge \varphi_{Inv}(X)$. One 
may also remove all transitions to/from these eliminated abstract states. Note that this process 
implicitly removes infeasible abstract states, that is, states $\psi(X)$ such that $\mathbb{R} \not\models \exists X : \psi(X)$. 

[77] This completes the construction of the abstract system DS = (Q, InitQ, t) for the 
continuous dynamical system CS = (X, InitX, Inv, f). 

[78] Theorem 1. Let CS = (X, InitX, Inv, f) be a continuous system and DS = (Q, InitQ, t) 
be the discrete abstraction as defined above. Then, DS is an abstraction (Definition 2) for CS. 

[79] Even though the abstract transition system is a finite-state system, one need not 
explicitly represent the states and transitions. The abstract system can be obtained implicitly with 
the states and transitions specified using predicate formulas. 

[80] Example 3. Following up on Example 2, the abstract transition system may be 
constructed on the set $P = \{V, I, 5V - 16I, -80V + 231I\}$ of polynomials. Assume that the initial 
abstract state is $I > 0 \wedge V > 0 \wedge 5V - 16I < 0 \wedge -80V + 231I > 0$. (The initial abstract state is obtained 
from the stable states in the abstract transition system for the "on" mode of the actuator.) 

[81] Of the $3^4 = 81$ abstract states, only 17 are feasible, and the infeasible states can be 
identified using a theorem prover. For example, the state $I = 0 \wedge V > 0 \wedge 5V - 16I < 0 \wedge -80V + 231I > 0$ 
is infeasible: a decision procedure for the reals can be used to deduce that this formula is 
unsatisfiable, and the state can therefore be removed as described earlier to refine the abstraction. 

[82] The outgoing transitions from the initial state $I > 0 \wedge V > 0 \wedge 5V - 16I < 0 \wedge -80V + 231I > 0$ 
are obtained as follows: (a) since $I > 0$ and $\dot{I} < 0$ (as $5V - 16I < 0$), in the successor 
state either $I > 0$ or $I = 0$; (b) since $V > 0$ and $\dot{V} < 0$ (as $-I < 0$), in the successor state either $V > 0$ or 
$V = 0$; (c) since $5V - 16I < 0$ and $\frac{d}{dt}(5V - 16I) > 0$ (as $-80V + 231I > 0$), in the successor state either 
$5V - 16I < 0$ or $5V - 16I = 0$; and (d) since $-80V + 231I > 0$ and the sign of $\frac{d}{dt}(-80V + 231I)$ is unknown, in the 
successor state either $-80V + 231I > 0$ or $-80V + 231I = 0$. Of the 16 potential successors, only 4 are 
feasible: 

$q_1 : I > 0 \wedge V > 0 \wedge 5V - 16I < 0 \wedge -80V + 231I > 0$ 
$q_2 : I > 0 \wedge V = 0 \wedge 5V - 16I < 0 \wedge -80V + 231I > 0$ 
$q_3 : I > 0 \wedge V > 0 \wedge 5V - 16I < 0 \wedge -80V + 231I = 0$ 
$q_4 : I = 0 \wedge V = 0 \wedge 5V - 16I = 0 \wedge -80V + 231I = 0$ 

[83] Continuing this way, the complete abstract system containing ten reachable abstract 

states can be constructed. It can also be determined that among these states, only the state is 

stable. 

Hybrid Automata 

[84] The technique for constructing finite state abstractions of continuous systems (as 
described above) extends naturally to hybrid systems. 

[85] The abstract system corresponding to the hybrid system HS = (Q, X, Init, Inv, t, f) and 
a finite set P of polynomials (over X) is a discrete state transition system $DS = (Q^A, Init^A, t^A)$, where 
$Q^A = Q \cup Q_P$ (with $Q_P = \{q_p : p \in P\}$) is the set of discrete variables, $Init^A \subseteq \mathbf{Q}^A$ is the set of initial states, and 
$t^A \subseteq \mathbf{Q}^A \times \mathbf{Q}^A$ is the set of transitions. The new discrete variables $Q_P$ are interpreted over the domain 
{pos, neg, zero} as before. Thus, the set of states in the abstract system, $\mathbf{Q}^A$, is $\mathbf{Q} \times \{pos, neg, zero\}^{Q_P}$. 

[86] Let $q^A = (q, \varphi) \in \mathbf{Q}^A$ be a state in the abstract system, where $q \in \mathbf{Q}$ is a discrete state 
of the hybrid automaton HS and $\varphi$ is a valuation of the variables in $Q_P$ over {pos, neg, zero}. As 
before, $\varphi$ is thought of as a formula in WFF(X). The transitions in the abstract system DS from the 
state $q^A$ are obtained as a union of two kinds of transitions: 

[87] Abstractions of the discrete transitions: If $(q, Cond, q') \in t$ is a discrete transition of 
the hybrid automaton HS, where $q, q' \in \mathbf{Q}$ are discrete states and $Cond \subseteq \mathbf{X}$ is a set of continuous 
states (the guard) represented by, say, the predicate formula $\psi$ over the variables X, then there is 
an abstract transition $((q, \varphi), (q', \varphi)) \in t^A$ if $\mathbb{R} \models \exists X : \varphi(X) \wedge \psi(X)$. 



[88] Abstractions of the continuous transitions: The rule for constructing new abstract 
transitions from the continuous flows is the same as before. The first component of the state is left 
unchanged: a new abstract transition $((q, \varphi), (q, \psi))$ in $t^A$ is added if $\psi$ can be obtained from $\varphi$ 
using the rules given above (such rules being applied to the flow corresponding to the discrete state 
q). 

[89] Cases may therefore be handled where the set $\mathbf{Q}$ of discrete states in HS is infinite, 
provided that the number of distinct "modes" (each of which can be specified as a formula over Q) 
is finite. 

[90] Theorem 2. Let HS = (Q, X, Init, Inv, t, f) be a hybrid automaton and $P \subseteq \mathbb{R}[X]$ be a 
finite set of polynomials over the set X of real variables. If $DS = (Q^A = Q \cup Q_P, Init^A, t^A)$ is the 
discrete transition system constructed by the above method, then DS is an abstraction for HS. 

[91] The following illustrates the abstraction technique on a simple hybrid system 
example. 

[92] Example 4. Consider a thermostat that controls the heating of a room. Assume that the 
thermostat turns the heater on when the temperature x is between 68 and 70 and it turns the heater 
off when the temperature is between 80 and 82. Suppose the continuous dynamics in the on and off 
modes are specified respectively by the equations 

$\dot{x} = -x + 100$ and $\dot{x} = -x$ 

[93] Assuming that the heater is initially off and the room temperature is between 70 and 
80, the hybrid automaton is given by HS = (Q, X, Init, Inv, t, f), where $Q = \{q_1\}$ is the set of discrete 
variables, $\mathbf{Q} = \{on, off\}$ is the set of discrete states (thus, $q_1 \in \{on, off\}$), $X = \{x\}$ is the set of 
continuous variables, $\mathbf{X} = \mathbb{R}$ is the set of continuous states, $Init = \{(off, x) : 70 < x < 80\}$ is the initial 
condition, $Inv = \{(on, x) : x < 82\} \cup \{(off, x) : x > 68\}$ is the invariant set, 
$t = \{(on, x, off) : x > 80\} \cup \{(off, x, on) : x < 70\}$ is the set of discrete transitions, and $f(on) = -x + 100$ and $f(off) = -x$ 
specify the continuous flow rates. 



[94] Now, the set of polynomials that appear in the guards is $\{x - 70, x - 80\}$, and the 
polynomials in the invariant specification are $\{x - 68, x - 82\}$. The derivative of each of these four 
polynomials is $\dot{x}$. In the mode when the heater is on, this evaluates to $-x + 100$, and in the mode when 
the heater is off, this is $-x$. Hence, there are two more polynomials, $\{x, x - 100\}$, in the set P. Further 
saturation of the set P of these six polynomials under time derivative yields no new polynomials. 

[95] Using the saturated set P of six polynomials, an abstraction for the thermostat hybrid 
model can be constructed and the final result is depicted in Figure 2. In the figure, transitions arising 
from the continuous and discrete evolutions of H are depicted by dashed lines. Furthermore, the 
representation of abstract states has been simplified. For example, the expression $70 < x < 80$ 
denotes the conjunction $70 < x \wedge x < 80 \wedge 68 < x \wedge x < 82 \wedge -x + 100 > 0 \wedge x > 0$. This 
conjunction is logically equivalent to $70 < x \wedge x < 80$. 
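
To make the thermostat abstraction concrete, the following Python sketch maps a concrete state (mode, x) to its abstract state over the six polynomials of the saturated set P, and counts the sign valuations that some real temperature can realize. The names `POLYS`, `abstract_state`, and `feasible_sign_vectors` are assumptions introduced for this illustration, and the sampling trick relies on all six polynomials being linear in the single variable x.

```python
def sign(v):
    """Map a real value to the abstract domain {pos, neg, zero}."""
    return "pos" if v > 0 else "neg" if v < 0 else "zero"


# The six polynomials of the saturated set P, as (label, function) pairs.
POLYS = [
    ("x-68", lambda x: x - 68),
    ("x-70", lambda x: x - 70),
    ("x-80", lambda x: x - 80),
    ("x-82", lambda x: x - 82),
    ("x", lambda x: x),
    ("x-100", lambda x: x - 100),
]


def abstract_state(mode, x):
    """Abstract state: the discrete mode paired with one sign per polynomial."""
    return (mode, tuple(sign(p(x)) for _, p in POLYS))


def feasible_sign_vectors():
    """Sign vectors realized by some real x: sample every root and every gap between roots."""
    roots = sorted([0, 68, 70, 80, 82, 100])
    samples = [roots[0] - 1] + roots + [roots[-1] + 1]
    samples += [(a + b) / 2 for a, b in zip(roots, roots[1:])]
    return {abstract_state("off", x)[1] for x in samples}


print(abstract_state("off", 75))
# ('off', ('pos', 'pos', 'neg', 'neg', 'pos', 'neg'))
print(len(feasible_sign_vectors()))  # 13
```

All temperatures strictly between 70 and 80 collapse to the same sign vector, which is the state written 70 < x < 80 above. Of the 3^6 = 729 conceivable valuations, only 13 are realized by a real x, before the invariants prune further.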

[96] The SAL (Symbolic Analysis Laboratory) tool set provides interfaces that can be 
used to construct discrete abstractions of hybrid systems as described herein. Other tools known to 
those in the art may be used for testing, analysis and checking, and this discussion is exemplary and 
not to be construed as limiting. The quantifier elimination decision procedure for real closed 
fields is implicitly used to decide the implications over the real numbers. The tool QEPCAD 
(Quantifier Elimination in elementary algebra and geometry by Partial Cylindrical Algebraic 
Decomposition), which is built over the symbolic algebra library SACLIB, has been integrated with the 
theorem prover PVS for this purpose. 

[97] The illustrative discrete abstractions constructed do not store information about the 
duration of a continuous run. However, HybridAbstraction extends quite easily to time-variant 
systems by explicitly considering time as another continuous variable. Some timing 
information can be included in the abstractions if polynomials containing this variable for time are 
included in the set P. 



[98] Qualitative reasoning has been used for modeling and analyzing physical systems in 
the face of incomplete knowledge of the system dynamics. The idea is to interpret a continuous 
variable, say x, over an abstract domain of the form $\{(-\infty, c_0), c_0, (c_0, c_1), c_1, (c_1, c_2), c_2, \ldots, c_n, (c_n, \infty)\}$, 
where $c_0, c_1, \ldots, c_n \in \mathbb{R}$ are constants. Model construction involves keeping track of the sign of 
the derivative of x. HybridAbstraction substantially extends qualitative reasoning by allowing for 
arbitrary polynomials, and not just state variables, for defining the qualitative state space. 
Additionally, HybridAbstraction includes the use of signs of higher order derivatives in the 
procedure. The resulting abstractions have more information and are more useful, as they are more 
amenable to analysis. 

[99] Moreover, although the examples of abstractions shown do not retain any timing 
information apart from the temporal ordering of abstract states, it is known that timing information 
can be introduced either by treating t as another state variable with time derivative equal to 1, or by 
incorporating quantitative timing information in the process of constructing an abstraction. See O. 
Stursberg et al., Hybrid Systems IV, vol. 1273 of LNCS, pages 361-377, Springer-Verlag, 1997. 

[100] HybridAbstraction is amenable to mechanization. HybridAbstraction has 
applications to test vector generation for hybrid systems that would cover all regions of the state 
space, where a region is defined as the subspace which is sign-invariant for a set of polynomials. The 
invention also provides integration with methods that employ additional quantitative information to 
create an abstraction. 

HybridAbstraction and Biological Systems 

[101] Many databases, standard modeling languages, and tools are now becoming available 
for biological information having to do with network effects. In particular, the Biological 
Simulation Program for Intra-Cellular Evaluation (BioSPICE) is an open source development 
movement that is creating computational models, tools, and infrastructure to help deal with modeling 
and analysis of complex biological systems. 

[102] However, despite the exponentially expanding oceans of biological data becoming 
available, to understand how cells compute and control themselves, accurate models that represent 
the aspects salient to the questions of biologists are needed. Even relatively simple prokaryotic cells, 
not to mention more complex eukaryotic (e.g. human) cells, are so complex that the models must be 
aggressively abstracted to enable more complete, deep, and scalable analysis, and to present results 
to biological domain experts. 

[103] The construction of mathematical models for biological processes is central to the 
science of bioinformatics and computational biology, but the inherent complexity of biological 
systems is daunting. Biological processes exhibit dynamics that range over a very wide time scale, 
contain stochastic components and sometimes discrete components as well. Sigmoidal nonlinearities 
are commonly observed in biological data correlation and a wide class of such functions is used in 
the resulting models. Biological processes operate at widely disparate time and spatial 
scales, spanning twelve or more orders of magnitude (from single cell to entire organism). A 
complete model of a biological process is quite complex and poses a challenge for simulation and 
analysis. 

[104] Genetic regulatory networks that work inside a cell form one class of biological 
system. Such networks are responsible for various kinds of cellular behaviors, for instance, 
recording, computing, and reacting to changes in the environment. Behavior is controlled through 
complex interaction between various protein concentrations that are regulated by transcription of 
various genes, and which, in turn, positively or negatively influence transcription of other genes, 
thus resulting in complex interwoven networks of control. At a much larger scale, metabolism can 
be studied at the level of the whole human body. For example, glucose metabolism can be modeled, 
to determine the blood glucose concentration in human tissues. In these cases, a phenomenological 
model is constructed using tissue and organ level concentrations as basic state variables. 

[105] Systems biology explores the quantitative study of biological processes as whole 
systems instead of isolated parts. Biological subsystems interact with one another to perform 
sophisticated biological functions, and a systems level view is necessary to understand the complex 
dynamics that underlie physiology in normal and diseased states. Systems biology research has 
focused on quantitative stochastic or differential equation models of biological systems. 

[106] Mathematical models of biological processes are often constructed by generating 
equations describing the physical laws that govern the system dynamics. The obtained model is 
tuned by determining free parameters and unknown rate constants using experimental data. 
However, this is not always possible, as quantitative experimental data is plagued with high levels of 
noise, and precise rates of reactions are for the most part unknown to science. Even when some rate 
constants have been inferred using algorithms for determining minimal error curve fits for available 
data points, the resulting model is just a "representative" that best matches all the available data. 

[107] The actual value of the parameter or rate constant is possibly stochastic in a given 
range, so there is a danger of overfitting quantitative models to the data, resulting in inaccurate 
predictions that are highly sensitive to small perturbations in input data. Moreover, the number of 
different state variables can grow quite large. Too many variables representing different molecular 
species involved in various compartments can adversely affect the ability to subsequently simulate 
and analyze the model. 

[108] To further complicate matters, assumptions about homogeneity and the presence of 
a large number of molecules often break down at the cellular and cellular compartment levels. This 
means that mathematical models based on such assumptions cannot be expected to be accurate. 

[109] An alternative approach is to seek completely qualitative, rather than quantitative, 
models of biological systems. Some examples of this approach include the high-quality curation and 
analysis of qualitative metabolic pathway information, symbolic analysis including term rewriting 
and model checking of curated pathway models, and other logical modeling approaches. However, 
the focus need not be on completely logical or symbolic mathematical modeling of biological 
systems. Hybrid systems may be used as an underlying formalism for modeling and analyzing 
biological systems. The qualitative or hybrid qualitative-quantitative modeling and analysis of 
biological systems is referred to as Symbolic Systems Biology. 
Biological Hybrid Systems 

[110] Hybrid systems, as described earlier, are mathematical models obtained by formally 
combining continuous dynamical systems with discrete transition systems. 
Continuous Dynamics 

[111] In a hybrid system, the continuous dynamics of time varying variables are given using 
differential equations. In models from biology, the differential equations specify how the 
concentrations of various molecular species evolve over time. These differential equations are 
obtained using standard physical laws, such as the law of mass action and the law of mass 
conservation once the gene states are fixed by the discrete logic. 

[112] For example, consider the case when a species X reacts (reversibly) with another 
species Y to form a complex XY. Schematically, this is represented by 

$X + Y \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} XY$ 

where $k_1$ and $k_{-1}$ are the reaction rates for the forward and backward reactions respectively, and 
the arrow notation $\rightleftharpoons$ is intended to mean reversible reactions. 

[113] If x, y, and z denote the concentrations of X, Y, and XY respectively, then using the 
law of mass action, which states that the rate of a reaction is proportional to the products of the 
concentrations of the reactants, a system of three differential equations is described: 

$\dot{x} = -k_1 x y + k_{-1} z$ 

$\dot{y} = -k_1 x y + k_{-1} z$ 

$\dot{z} = k_1 x y - k_{-1} z$ 
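
The three rate equations can be written directly as a right-hand-side function. In the Python sketch below, the function name `mass_action_rhs` and all numeric concentrations and rate constants are hypothetical values chosen for illustration, not measured data.

```python
def mass_action_rhs(x, y, z, k1, k_minus1):
    """Time derivatives (dx/dt, dy/dt, dz/dt) for the reversible reaction X + Y <-> XY."""
    forward = k1 * x * y       # rate of complex formation
    backward = k_minus1 * z    # rate of complex dissociation
    dx = -forward + backward
    dy = -forward + backward
    dz = forward - backward
    return dx, dy, dz


# Hypothetical concentrations and rate constants.
dx, dy, dz = mass_action_rhs(x=2, y=3, z=1, k1=1, k_minus1=4)
print(dx, dy, dz)  # -2 -2 2

# At equilibrium the forward and backward rates balance (k1*x*y == k_minus1*z).
print(mass_action_rhs(x=2, y=2, z=1, k1=1, k_minus1=4))  # (0, 0, 0)
```

Because dx/dt + dz/dt = 0 for every state, the total amount of X (free plus bound in the complex) is conserved by this system; the same holds for Y.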
[114] If a species, say X, participates in more than one reaction, say 

$X + Y_j \rightleftharpoons XY_j$ for $j = 1, \ldots, J$, 

then its rate equation is obtained by collecting terms from each reaction in which it participates. 
Adding a source and a sink term, this becomes 

$\dot{x} = \sum_{j=1}^{J} k_{-j} z_j - \sum_{j=1}^{J} k_j y_j x + r_{src} - r_{sink}$, 

Equation 1 

where $y_j$ represents the concentration of $Y_j$, $z_j$ represents the concentration of the complex $XY_j$, $r_{src}$ is 
the rate of production of X, and $r_{sink}$ is the rate of utilization of X (independent of the reactions that 
have been accounted for explicitly). For example, if X were a protein, then the effect of the production 
of X by transcription would contribute a source term and its decay by proteolysis would contribute a 
sink term. 

[115] In the example of modeling blood glucose in humans, consider a typical 
physiologic compartment shown in Figure 4. The mass balances for this compartment can 
be written as 

$V_B \dot{C}_{Bo} = Q_B (C_{Bi} - C_{Bo}) + PA (C_I - C_{Bo}) - r_{RBC}$ 
$V_I \dot{C}_I = PA (C_{Bo} - C_I) - r_T$ 

where $V_B$ is the capillary blood volume, $V_I$ is the interstitial fluid volume, $Q_B$ is the volumetric blood 
flow rate, PA is the permeability-area product, $C_{Bi}$ is the arterial blood solute concentration, $C_{Bo}$ is 
the capillary (and venous) blood solute concentration, $C_I$ is the interstitial fluid solute concentration, 
$r_{RBC}$ is the rate of red blood cell uptake of solute, and $r_T$ is the tissue cellular removal of solute 
through the cell membrane. In the first equation above, the first term on the right-hand side is the effect 
of convection, the second term corresponds to diffusion, and the last one is the metabolic sink. 

Discrete component 

[116] Mathematical models developed by biologists are often continuous dynamical 
systems, as exemplified by much of the work in systems biology. 

[117] It is useful to consider hybrid discrete-continuous models to enable more complete, 
deep, and scalable analysis. Hybrid modeling and analysis can provide great leverage in the realm of 
complex biological processes, and can also provide abstractions useful in presenting results to 
human users. The discrete dynamics can arise in many different ways and a discussion of some of 
them follows. 

[118] The purely continuous models of biological systems can be too large and complex to 
be maximally useful for simulation and analysis. On the other hand, a fully discrete approximation 
of the model can sometimes lose crucial and pertinent information. Hybrid systems provide a 
rigorous foundation for modeling biological systems at desired levels of abstraction, approximation, 
and simplification. For example, systems that exhibit multiscale dynamics can be simplified by 
replacing certain slowly changing variables by their piecewise constant approximation. This is 
particularly useful when the property of interest is defined on a small time scale. Additionally, 
sigmoidal nonlinearities are commonly observed in biological data correlation and the corresponding 
models often use (continuous) sigmoidal functions. These can also be approximated by discrete 
transitions between piecewise-linear regions. Figure 5 shows a generic plot of data points and the 
corresponding sigmoidal curve (solid line) generated by tuning parametric sigmoidal curves. The 
solid curve is chosen to best match the available data, and the heavy dashed line is a piecewise-linear 
approximation of the data points. The light dotted lines represent nondeterministic bounds on 
behavior. In some instances, nondeterministic upper and lower bounds are more useful than 
deterministic approximations, because they capture all the behaviors of the system. 

[119] For example, gene transcription and translation lead to production of proteins in cells. 
The rate of transcription of the corresponding gene determines the source term in Equation 1 the 



differential equation for concentration of that protein. This term is, in general, a function of the 
concentrations of several other molecules that affect the transcription of the relevant gene. This 
influence of concentrations of proteins and sigma factors on transcription is conventionally modeled 
using nonlinear continuous functions. This function is usually a steep sigmoidal curve, which is 
described using higher-order polynomial expressions or hyperbolic trigonometric functions. 
Sigmoidal nonlinearities are also observed in many other biological data. For instance, in the case of 
glucose metabolism in the human body, the normalized rate of peripheral glucose uptake (a sink term) is
such a nonlinear function of the normalized peripheral interstitial insulin concentration. 

[120] The use of sigmoidal functions in biological models can be replaced by piecewise 
constant or piecewise linear approximations as shown by the dashed line in Figure 5, resulting in a 
hybrid model with a discrete mode change logic. Such effects are captured via discrete mode 
changes. A very steep sigmoidal curve can be approximated by a step function. In the gene 
regulation example, this corresponds to assuming that a particular gene can be in one of two states:
"on" or "off". A discrete transition describes how the various regulators combine to choose one of
these two states. In completely qualitative modeling, one can represent such states with Boolean
variables. More refined but still discrete, stepwise models can result from distinguishing more than
two states for genes, for example, "off", "low", "medium", and "high". More complicated discrete
logic would then describe the process of choosing between these four possible modes. More
accurate approximations for sigmoidal curves are obtained by piecewise linear approximations. 
These transitions form the discrete component of the hybrid model. 
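The approximations described above can be illustrated with a minimal Python sketch. The Hill-type curve, its threshold `k`, its steepness `n`, and the ramp breakpoints `lo` and `hi` are all hypothetical choices made for illustration; they are not taken from Figure 5 or from any fitted model.

```python
def hill(x, k=1.0, n=8):
    """Steep sigmoidal (Hill-type) activation with values in [0, 1)."""
    return x**n / (k**n + x**n)

def step(x, k=1.0):
    """Two-mode ("off"/"on") abstraction: a step at the threshold k."""
    return 1.0 if x >= k else 0.0

def piecewise_linear(x, lo=0.75, hi=1.25):
    """Three-region approximation: off, linear ramp, on."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)

def max_error(approx, xs):
    """Largest pointwise deviation of an approximation from the sigmoid."""
    return max(abs(hill(x) - approx(x)) for x in xs)

xs = [i / 100 for i in range(0, 301)]   # sample points in [0, 3]
err_step = max_error(step, xs)
err_pwl = max_error(piecewise_linear, xs)
```

On this sample grid the three-region approximation tracks the curve more closely than the step function, at the cost of one additional discrete mode.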

[121] A second source of discrete behavior in models of biological systems is the presence 
of an inherently discrete process. The physical laws which yield differential equation (continuous) 
models are applicable only under certain assumptions. For example, the law of mass action holds 
only when there is a large number of molecules that are homogeneously mixed.



[122] Certain molecular species are present in abundance, and the law of mass action, which
states that the rate of a reaction is proportional to the product of the concentrations of the reactants,
can be used to describe their evolution using a differential equation. But these assumptions may not
always hold. Inside a cell, there are dynamics that are governed by the action of only a few
molecules. Ignoring the stochastic aspect temporarily, chemical dynamics at small numbers are best 
modeled using discrete transitions, cf. the Master equation. This again results in hybrid models of 
biological processes. 
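The contrast between the two regimes can be sketched in Python as follows. The function names, the species names, and the rate constant are hypothetical, and the discrete step assumes that the reactants of a reaction are distinct species.

```python
def mass_action_rate(k, concentrations):
    """Continuous regime: rate = k times the product of reactant concentrations."""
    rate = k
    for c in concentrations:
        rate *= c
    return rate

def fire(counts, reactants, products):
    """Discrete regime: one reaction event on integer molecule counts.

    Returns the new counts, or None if some reactant is absent
    (the transition is then disabled).
    """
    if any(counts.get(s, 0) < 1 for s in reactants):
        return None
    new_counts = dict(counts)
    for s in reactants:
        new_counts[s] -= 1
    for s in products:
        new_counts[s] = new_counts.get(s, 0) + 1
    return new_counts

rate = mass_action_rate(2.0, [3.0, 0.5])            # A + B -> C at high copy number
after = fire({"A": 1, "B": 1, "C": 0}, ["A", "B"], ["C"])
blocked = fire(after, ["A", "B"], ["C"])            # no A or B left
```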

[123] Discrete mode changes can also result from the modeling of faulty modes. In the case 
of glucose metabolism, the kidney does not excrete any glucose in normal conditions, but it starts 
excreting glucose if the level of glucose rises very high. This effect can be captured using a discrete 
transition. 
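A guarded mode switch of this kind might be sketched in Python as below. The threshold and the excretion rate are placeholder values chosen only for illustration; they are not parameters of the glucose model discussed here.

```python
def renal_excretion(glucose, threshold=180.0, rate=0.01):
    """Return (mode, excretion flow) for a given glucose level.

    threshold and rate are hypothetical placeholders. The guard
    glucose > threshold triggers the discrete mode change.
    """
    if glucose > threshold:
        return "excreting", rate * (glucose - threshold)
    return "normal", 0.0

normal_mode = renal_excretion(90.0)
high_mode, high_flow = renal_excretion(280.0)
```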

Nondeterminism and analysis of safety properties 

[124] Uncertainties and stochastic behavior are common in biology. Rate constants and 
several other parameters in models of biological systems are determined using algorithms for 
determining minimal error curve fits for available data points. For example, the rate constants A3 and 
k-j in Equation 1 and the source and sink terms in that equation are determined in this way. The 
sigmoidal curve of Figure 5 is obtained this way from the given data points. 

[125] Parameter values thus obtained are "representative" values; they do not capture all
observed behaviors. The actual value of the parameter is possibly stochastic in a given range. In 
many cases, the interest is in knowing about all possible behaviors of the system, rather than the 
behavior of the system assuming a representative value for the parameters. For example, when 
studying the effect of insulin injections on blood glucose concentrations, all possible blood glucose 
concentrations that a human body may exhibit are of interest. In such cases uncertainties can be 
modeled using nondeterminism and the resulting model can be analyzed for all possible behaviors. 
In Figure 5 the given data points lie between the two piecewise-linear curves shown by dotted lines.
The nondeterministic hybrid system resulting from using the two dotted lines as the approximation
captures all the observed behaviors of the system (and possibly more). 
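The idea of bounding all behaviors between a lower and an upper curve can be illustrated with a small Python sketch. It assumes a hypothetical one-variable model dx/dt = s - k*x in which the source term s is only known to lie in an interval, and it uses forward Euler integration with k*dt < 1; none of these choices comes from the models discussed in this document.

```python
import random

def euler_step(x, s, k, dt):
    """One forward-Euler step of dx/dt = s - k*x."""
    return x + dt * (s - k * x)

def envelope(x0, s_lo, s_hi, k, dt, steps):
    """Lower and upper bounding runs, using the extreme source terms."""
    lo, hi = [x0], [x0]
    for _ in range(steps):
        lo.append(euler_step(lo[-1], s_lo, k, dt))
        hi.append(euler_step(hi[-1], s_hi, k, dt))
    return lo, hi

def random_run(x0, s_lo, s_hi, k, dt, steps, rng):
    """One nondeterministic behavior: s is re-chosen in [s_lo, s_hi] at every step."""
    xs = [x0]
    for _ in range(steps):
        xs.append(euler_step(xs[-1], rng.uniform(s_lo, s_hi), k, dt))
    return xs

lo, hi = envelope(1.0, 0.8, 1.2, k=0.5, dt=0.1, steps=200)
runs = [random_run(1.0, 0.8, 1.2, 0.5, 0.1, 200, random.Random(seed)) for seed in range(20)]
```

Every sampled run stays between `lo` and `hi`, which mirrors how the two dotted curves enclose the observed data points while possibly admitting additional behaviors.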

[126] Unknown rate constants can be modeled using unspecified symbolic constants, called 
parameters, in a hybrid formalism. The rates of reactions and other such unknown constants can be 
modeled as parameters. Numeric values for such parameters are not required for subsequent 
analysis. However, to generate nontrivial models that exhibit interesting behaviors, these parameters 
need to be constrained to take only certain values. Such constraints can be specified in the model 
using inequalities over expressions containing these parameters. This gives rise to a highly 
nondeterministic model, that is, the model can exhibit several different behaviors — one 
corresponding to every exact numerical instantiation for the parameters that is consistent with the 
constraints. Although this process of nondeterministic modeling does not accurately capture the 
stochastic nature of biological processes that arises due to random fluctuations on the small numbers 
of molecules involved, the analysis approach reintroduces some noise by assuming that the 
unknown parameters are allowed to change randomly (while still remaining consistent with the 
constraints) finitely many times. To the extent information about exact probabilities remains missing 
from the non-deterministic model, results of analysis can sometimes be coarse. 
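As a rough illustration of constrained parameters, the following Python sketch treats two unknown rates `s` and `k` of a hypothetical model dx/dt = s - k*x as parameters restricted by inequalities. The threshold `THETA`, the constraint set, and the sample grid are assumptions made for this example only. Each admissible instantiation corresponds to one concrete behavior of the nondeterministic model.

```python
from itertools import product

THETA = 2.0  # hypothetical activation threshold

# Constraints are inequalities over the parameters, not numeric values.
constraints = [
    lambda p: p["s"] > 0 and p["k"] > 0,      # rates are positive
    lambda p: p["s"] > p["k"] * THETA,        # production outpaces decay at THETA
]

def consistent(p):
    """True if a numeric instantiation satisfies every constraint."""
    return all(c(p) for c in constraints)

def steady_state(p):
    """Equilibrium of dx/dt = s - k*x."""
    return p["s"] / p["k"]

grid = [0.5, 1.0, 2.0, 4.0]
instances = [{"s": s, "k": k} for s, k in product(grid, grid)]
admissible = [p for p in instances if consistent(p)]
```

Any property implied by the constraints, such as the steady state exceeding `THETA`, then holds for every admissible instantiation without fixing numeric values in advance.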
Composition 

[127] Compositionality is an important feature required to model and analyze large models 
of any system. Larger systems are described by putting together smaller networks and component 
subsystems. Compositional modules are subsystems of a larger system that exhibit identifiable 
interfaces, are modifiable independently, and enable abstract modeling. Modularity is one of the 
crucial aspects of designing (and describing) large systems, including computer software and 
hardware systems. It permits clean and scalable description, and also helps appropriately designed 
tools in performing simulation and analysis on the models. 



[128] It is increasingly apparent that biological systems exhibit certain kinds of clean 
modularity. Biological examples of modular construction include the universal genetic code, 
translation into amino acid sequences, protein domains, operon structure, bilipid layer membranes, 
organelles, organs, communities of organisms, and signaling and metabolic pathways. Cells contain 
many different regulatory pathways, or networks of interacting proteins or other molecules, in 
several physical compartments, which interact with each other at certain well-defined points. That 
is, pathways have been identified that have identifiable interfaces with other pathways, appear to be 
modifiable independently, and enable abstract modeling. The complete behavior of some aspect of a 
cell can thus be described by putting together all the various models for the individual pathways and 
sharing the information on molecular species that are shared by two or more such subsystems.

HybridSAL

[129] One approach to modeling is HybridSAL - a system for hybrid system modeling and 
analysis, embodying the present invention; other approaches known to those of skill in the related 
arts are also amenable for use in the inventive approach. Models of hybrid systems can be written in 
the HybridSAL language. These models can then be analyzed for safety properties, that is, 
properties regarding all possible behaviors of the system. The analysis is done using an abstraction 
and model checking framework. This tool has been used on examples from a diverse range of 
application areas such as automobile transmissions, cruise control algorithms, collision avoidance, 
and genetic and biochemical networks. 

[130] HybridSAL can be used to compositionally build parametric hybrid models of 
biological processes. Three important features that enable effective modeling and analysis of 
regulatory pathways are the use of 

(i) discrete transitions to model activation and inhibition of transcription, 



(ii) parameters to specify unknown reaction rates in the model, with support for 
specifying constraints on these parameters to capture whatever information is 
available about the values of these parameters, and 

(iii) composition to build larger models from component models. 

Features such as these distinguish the inventive modeling approach from other currently available 
approaches. 

[131] An automatic tool for simplifying models based on the interest of the biologist 
renders the model amenable to the HybridSAL tool. Complex multiscale and stochastic models of
biological processes can be simplified to create smaller hybrid models. This process can be guided 
by the user. There are several different options for user input. The user can specify the environment 
and the parts of the model that are irrelevant under the given environment can be sliced. 
Alternatively, the user can specify the time scale of interest and the dynamics that happen either at 
very small or very large time scales (compared to what the user has specified) can be replaced by 
algebraic equations or discrete transitions. As another option, the user can specify the components 
(compartments) into which the model should be decomposed. The HybridSAL tool can then work 
on the model thus created by the model simplifier.

Analysis

[132] A modeling formalism is only as useful as the analysis tools that support it. The 
parametric hybrid modeling formalism enables the development of a variety of analysis tools. 
Combining discrete and continuous modeling techniques results in simpler and more composable 
models. Compositionality allows for the development of scalable tools. Parametric modeling 
languages permit the use of tools for model refinement.

[133] Provided are analysis techniques for:

(a) automatically creating sound approximations of the model that are smaller and 
simpler, thus amenable to more intensive (computationally complex) analysis, 



(b) proving properties, such as stability, for the models, 

(c) generating potentially interesting behaviors of the model, and 

(d) generating constraints on the unknown parameters automatically so that the 
constrained model exhibits a certain behavior. 

[134] Tools for model refinement, simplification, and simulation, along with improved 
methods of presenting abstract models and the results of hybrid analysis to biological domain experts 
are provided. 

[135] The process of creating sound abstractions is based on combining qualitative 
techniques with predicate abstraction. It is driven by powerful symbolic reasoning engines. The
simplified model generated is a discrete finite-state transition system. It is an abstraction, in a very 
precise and rigorous sense, of the original model. The abstraction technique can be further 
optimized for linear and nonlinear systems. The second step of exploration on this finite-state 
system is carried out using model-checking. 
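The two-step idea (abstract, then explore) can be sketched in Python for a deliberately tiny case. The sketch is not the HybridAbstraction algorithm itself: it assumes a single polynomial p = x, an illustrative system dx/dt = -k*x with k > 0, and a sign rule under which p stays at zero when both p and dp/dt are zero, which is adequate for this example but would need higher derivatives in general.

```python
from collections import deque

def next_signs(sign_p, sign_dp):
    """Signs p may take next, given its current sign and the sign of dp/dt."""
    if sign_p == "pos":
        return {"pos", "zero"} if sign_dp == "neg" else {"pos"}
    if sign_p == "neg":
        return {"neg", "zero"} if sign_dp == "pos" else {"neg"}
    # p == 0: it leaves zero in the direction of dp/dt, or stays if dp/dt == 0.
    return {sign_dp}

def deriv_sign(sign_x):
    """Sign of dx/dt for the illustrative system dx/dt = -k*x with k > 0."""
    return {"pos": "neg", "zero": "zero", "neg": "pos"}[sign_x]

def reachable(initial):
    """Breadth-first exploration of the finite abstract transition system."""
    seen, queue = {initial}, deque([initial])
    while queue:
        state = queue.popleft()
        for nxt in next_signs(state, deriv_sign(state)):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

reach_from_pos = reachable("pos")
```

The abstraction is an over-approximation: it allows the move from "pos" to "zero" even though the concrete solution only approaches zero asymptotically. It still proves that "neg" is unreachable from "pos", which is the kind of safety property the exploration step establishes.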

[136] The kinds of analysis that can be performed on hybrid models of regulatory networks are of
interest. Such models are almost always incomplete and underspecified. Hence, the analysis
process is not a single step activity, but involves interleaved steps of a) model reduction, b) model 
analysis, and c) model refinement. 

[137] Model reduction is the process of simplifying the model based on experimental or 
domain knowledge, or based on relevance to the particular observation of interest. If interest is in a 
particular behavior of the organism, then the parts of the model that do not directly influence this 
behavior can be removed and the resulting simpler model can be used for analysis. This process of 
model reduction can be done by the user, or by specialized automated tools. 

[138] Model analysis consists of analyzing the model for exhibition of given properties. 
This is presently done in two steps. In the first step, a qualitative model is automatically extracted 
from the given parametric hybrid system model. In the second step, the extracted model is analyzed
for the properties of interest. The qualitative model is an abstraction, in a very precise formal sense,
of the original model. Formally, it is a discrete finite-state transition system. The second step of 
exploration on this finite-state system is carried out using model-checking. 

[139] Model refinement involves concretizing or constraining some of the unspecified 
parameters. This step is guided in two ways: using the results of the model analysis phase, or using 
some experimentally observed behavior. In the first case, the model is refined so that the unexpected 
results returned by the model analysis phase are fixed. 

[140] In the second case, given an observed behavior, the parameters of the model are 
constrained so that the observed behavior becomes a valid behavior of the model and unobserved 
behaviors are eliminated. One approach for model refinement based on this second case is
described. The well-known approach of counter-example guided abstraction refinement can be used
in the first case.

Examples

[141] Aspects of hybrid modeling and analysis as applied to three specific biological 
examples are presented. Many other examples will occur to those of skill in the art, and these three 
are intended as illustrative applications of the inventive method.

A. Glucose metabolism in humans 

[142] The human glucose-insulin system and the model of this system proposed by Guyton
et al. and Sorensen are used as an illustrative example. This model has been used to design a
model-based predictive control algorithm to maintain normoglycemia, via a closed-loop insulin infusion
pump, in the Type I diabetic patient. A formal correctness analysis of any such control algorithm can
be established by showing that the blood glucose level always remains between 70 and 100 mg/dl. For
"representative" parameter values, this can perhaps be shown using simulations, but that analysis 
would never yield real guarantees, since parameter values vary over ranges across different 
individuals. Thus, higher assurance of bounds on behavior requires analysis over all behaviors of
the corresponding nondeterministic model. Complete exploration of all behaviors of an abstracted
system provides valuable insight beyond the partial exploration of some behaviors (e.g., forward
simulation) of a more concrete system model. 

[143] The final glucose metabolism model consists of twenty-two simultaneous nonlinear 
ordinary differential equations. It is obtained by dividing the human body into six physiologic 
compartments: brain, heart and lungs, periphery, gut, liver, and kidney. There is a state variable for 
the glucose and insulin concentration in each of these six compartments. Wherever necessary, these 
compartments are subdivided into interstitial fluid space and vascular blood space. This model is 
decomposed into three components in HybridSAL, describing glucose metabolism, insulin
metabolism, and glucagon metabolism respectively. Additionally, all nonlinearities in the model 
arise from sigmoidal functions, which can be eliminated in favor of piecewise linear approximations 
to yield a hybrid system with linear continuous dynamics. Further simplifications are possible by 
noticing that the change in glucagon concentrations is minimal and slow compared to other
state variables. The insulin concentrations act as inputs to the glucose module and consequently the 
insulin concentrations stabilize first, followed by the glucose concentration stabilizing. 

[144] There are two sources of insulin in the insulin metabolism model: pancreatic insulin 
release and insulin injections. If these inputs are set to zero (say, to model a diabetic patient), then 
the insulin model stabilizes at zero because there is no other source of insulin in the model. 

[145] If it is assumed that the inputs to the insulin module change very slowly compared to 
the dynamics of insulin concentration, then the system can be analyzed assuming constant inputs. 
The resulting insulin model is a linear system with one complex eigenvalue with negative real part, 
and all other eigenvalues are real and negative. This indicates that the system is stable, though it 
could exhibit some damped oscillation. Using known results for computing approximate reachability sets of
linear systems, it is easy to compute over-approximations of reach sets for this system. The reach
sets enable computation of a bound on the insulin concentrations. The glucose metabolism module
reduces to a linear system if the insulin inputs are fixed to their lower or upper bounds. The
resulting linear system also has one complex and seven negative real eigenvalues. Again using 
techniques for the approximate reachability of linear systems, it is possible to compute approximate 
reach sets that bound the modeled behavior of glucose concentrations. Note that because of the 
construction of the abstractions and approximations, the bounds thus obtained are conservative and 
robust to small changes in parameter values. 
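The eigenvalue-based reading of stability used above can be illustrated with a minimal Python sketch for a 2x2 system. The matrix entries are arbitrary illustrative numbers, not coefficients of the insulin or glucose modules; for a real matrix, non-real eigenvalues appear as a complex-conjugate pair.

```python
import cmath

def eig2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] from its trace and determinant."""
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

def classify(eigs, tol=1e-12):
    """Return (stable, oscillatory) for dx/dt = A x."""
    stable = all(e.real < 0 for e in eigs)          # all real parts negative
    oscillatory = any(abs(e.imag) > tol for e in eigs)
    return stable, oscillatory

damped = classify(eig2x2(-1.0, -2.0, 2.0, -1.0))    # eigenvalues -1 +/- 2i
monotone = classify(eig2x2(-1.0, 0.0, 0.0, -3.0))   # eigenvalues -1 and -3
```

A result of `(True, True)` corresponds to the situation described in the text: the system is stable but may exhibit damped oscillation on the way to equilibrium.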
B. Subtilis sporulation initiation 

[146] The bacterium Bacillus subtilis initiates spore formation when there is a nutrient
deficiency and the environment is not conducive to growth. The cellular commitment to sporulate is 
regulated by the complex network of transcriptional control of various genes and interactions 
between various proteins. Based on the data provided, a model of the sporulation initiation network 
of B. subtilis was constructed. The HybridSAL model consists of six components. The phosphorelay
chain is described in one of the important components. The effect of promoters and inhibitors on 
gene regulation was captured via discrete transitions. Unknown rate constants were modeled using 
parameters. The parameters were constrained by inequalities. In some cases, the constraints were 
generated using a tool for doing quantifier-elimination over the theory of reals, called QEPCAD. As 
noted above, the constrained model is highly nondeterministic and captures a whole spectrum of 
behaviors. 

[147] The constrained parametric hybrid model of the sporulation initiation network was
analyzed using HybridAbstraction and model checking for stability properties. The stability
properties of the resulting hybrid model were observed to be highly sensitive to the discrete logic 
modeling gene regulation. The HybridAbstraction approach is partly based on qualitative methods. 
The analysis of the system using these techniques partially accounts for some stochastic behaviors 
where the unknown parameters are allowed to fluctuate finitely many times to values consistent with
the constraints. This results in several unexpected and interesting behaviors of the sporulation
model.

C. Delta Notch Lateral Inhibition Mechanism 

[148] Inhibitory lateral signaling between adjacent cells (Delta-Notch lateral inhibition) is
one of the central processes responsible for cell differentiation in a cluster of identical cells.
HybridAbstraction was applied to model Delta-Notch inhibition; the basic modeling unit is a hybrid
automaton, and more complex models were built using compositions over hybrid automata. Analysis
of the model proceeds by creating a finite-state discrete abstract transition system and model
checking the abstraction against the property of interest. The completely automated abstraction
technique employed suitable decision procedures and theorem provers. Figure 3 depicts the abstract
SAL transition corresponding to the continuous dynamics in the mode delta = false, notch = false of
the Delta-Notch one cell model. A two cell complex may be created by composing two single cells.
Because the same SAL variable names occur in both instances of the one cell model,
some renaming of variables not local to the module is necessary to avoid conflicts. Similarly,
communication between the two cells is captured by renaming the connected variables to the same name.

twocells: MODULE =
  LOCAL vd1, vd2 IN
    ( (RENAME g4 TO vd2, g5 TO vd1 IN cell)
      []
      (OUTPUT delta2, notch2 IN
        (RENAME g4 TO vd1, g5 TO vd2,
                delta TO delta2,
                notch TO notch2 IN cell) ) );

[149] The composition operator [] is used to construct the module "twocells" using two
instances of the module "cell." Connecting the output variable g5 of one cell to the input variable g4
of another cell is done by renaming them to the same name. The variables delta2 and notch2
describe the mode of the second cell.
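The effect of renaming followed by composition can be mimicked in a few lines of Python. This is only a sketch of the interface bookkeeping, not of SAL semantics: the dictionary representation, the helper names `rename` and `compose`, the connecting names `vd1` and `vd2`, and the assumption that delta and notch are outputs of "cell" are all illustrative.

```python
def rename(module, mapping):
    """Return a copy of a module interface with variables renamed."""
    return {
        "inputs": {mapping.get(v, v) for v in module["inputs"]},
        "outputs": {mapping.get(v, v) for v in module["outputs"]},
    }

def compose(m1, m2):
    """Compose two interfaces; an input driven by the other side's output is connected."""
    clash = m1["outputs"] & m2["outputs"]
    if clash:
        raise ValueError(f"output name conflict: {sorted(clash)}")
    outputs = m1["outputs"] | m2["outputs"]
    inputs = (m1["inputs"] | m2["inputs"]) - outputs
    return {"inputs": inputs, "outputs": outputs}

# Assumed interface of the one cell module: g4 is an input, g5 an output.
cell = {"inputs": {"g4"}, "outputs": {"g5", "delta", "notch"}}

cell1 = rename(cell, {"g4": "vd2", "g5": "vd1"})
cell2 = rename(cell, {"g4": "vd1", "g5": "vd2", "delta": "delta2", "notch": "notch2"})
twocells = compose(cell1, cell2)
```

Composing two unrenamed copies of `cell` raises an error because their output names collide, which is the conflict the renaming is meant to avoid. After renaming, each cell's g5 drives the other cell's g4 and no unconnected inputs remain.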

[150] The module "twocells" can be model-checked against the stability properties as in the
one cell model. The properties of the two cell model are verified as follows:

stability4: THEOREM
  twocells |-
    G( (delta AND notch = FALSE AND
        delta2 = FALSE AND notch2) =>
      G(delta AND notch = FALSE AND
        delta2 = FALSE AND notch2) );

stability5: THEOREM
  twocells |-
    G( (delta = FALSE AND notch AND
        delta2 AND notch2 = FALSE) =>
      G(delta = FALSE AND notch AND
        delta2 AND notch2 = FALSE) );

[151] This shows that the states where one cell has high Delta and low Notch concentration, 
while the other has low Delta and high Notch concentration are stable states for a two cell complex. 
The other states are shown to be unstable, as model-checking the corresponding properties yields a 
counter-example. However, the property that the two cell system eventually reaches one of the 
equilibrium states does not hold true, and the model-checker produces a counter-example. 
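The shape of these results can be reproduced on a toy model. The Python sketch below uses a hypothetical synchronous Boolean update (Delta switches on when the cell's own Notch is off; Notch switches on when the neighbor's Delta is on). It is not the abstraction produced by HybridAbstraction, and it is deterministic, so a property of the form G(p => G p) for a single state p reduces to p being a fixed point.

```python
from itertools import product

def update(state):
    """One synchronous step of the toy two-cell model (hypothetical rules)."""
    delta, notch, delta2, notch2 = state
    return (not notch, delta2, not notch2, delta)

def is_stable(state):
    """In this deterministic toy model, stability means being a fixed point."""
    return update(state) == state

def run_until_repeat(state):
    """Follow the trajectory until a state repeats; return (visited, repeated state)."""
    visited = []
    while state not in visited:
        visited.append(state)
        state = update(state)
    return visited, state

all_states = list(product([False, True], repeat=4))
stable_states = {s for s in all_states if is_stable(s)}

start = (False, False, False, False)      # both cells start identical
visited, repeated = run_until_repeat(start)
```

In this toy model the only stable states are the two differentiated configurations named in stability4 and stability5, and a symmetric start cycles forever without reaching either of them, analogous to the oscillatory counter-example reported by the model checker.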

[152] The counter-example reveals that the property-falsifying trajectory corresponds to an
oscillatory behavior of the system. The abstract transition system can be interpreted as exhibiting
the sum total of all behaviors of specific concrete realizations of the original system (using specific
initialization values and parameter values consistent with the assumptions made during construction
of the abstraction). The computation time for creating the abstraction, together with the model
checking time, was several seconds of wall-clock time.

[153] Symbolic Systems Biology promotes the construction and experimental validation of
models and analyses that explain and predict the behavior of biological systems. Symbolic Systems 
Biology is characterized by a synergistic integration of theory, computation, and experiment. 

[154] Only through such an interdisciplinary approach can a scalable, rigorous, and
systematic understanding of complex biological processes be achieved. Hybrid discrete-continuous 
formalisms can be used to provide access to computational analysis enabling accurate modeling of 
some of the dynamics of biological systems. Together with increasing access to biological network 
information (through exponentially growing databases and BioSPICE and related tool platforms) and 
qualitative modeling and analysis techniques, hybrid modeling and analysis of the computation and 
control of cells, tissues, and organisms may enable Symbolic Systems Biology to be useful to 
biologists. HybridAbstraction is integral to hybrid modeling and analysis in complex biological 
systems.